1. Research Methodology and Data Analysis Felix Famoye
Central Michigan University, USA
Currently a Fulbright Scholar, Unilag Presented by Felix Famoye 4/10/2012 1
2. Thanks to Workshop Organizers Thanks for the Invitation
My research area is Statistics
Additional thanks to Dean, SPGS, Unilag
3. Outline of the talk: Introduction
Research Types
Sub-sections of Research Methodology
Data Analysis
Conclusion/Final Comments
4. Introduction The methodology section shows how your research questions will be answered.
It must be appropriate for your research type.
Describe in detail the methodology and materials, if any, that you will use to carry out your research.
This section may have some of the following sub-sections:
5. Some sub-sections: Conceptual and/or Theoretical Framework
Models and/or Theorems Formulation
Materials and Experiments
Data Collection Method
The choice of sub-sections depends on the research type.
6. Research Types Broadly speaking, we have qualitative and quantitative research studies.
These two broad methods have been further divided into different types.
The boundaries between the research types may not be clear-cut.
7. Qualitative method: In this approach, narrative data is collected in order to study the topic of interest.
It is also called ethnographic (investigating cultures) or anthropological research.
The data analysis includes coding and production of verbal synthesis.
No statistical procedures or other means of quantification are involved.
8. Types of Qualitative Research: Historical research allows one to discuss past and present events. The method investigates the why and how of decision making. An example: Factors that led to the creation of more states (or more universities) in Nigeria.
Qualitative research is involved in the study of current events rather than past events. An example: A case study of how students solve algebraic equations.
9. Quantitative method: In this approach, data (both numerical and non-numerical) is collected in order to describe, predict and/or control phenomena of interest.
The data analysis is mainly statistical.
Quantitative research can be used to verify hypotheses formulated through qualitative research. Consider the example on how students solve algebraic equations.
10. Quantitative research generally includes: Development of models, theories and hypotheses.
Development of instruments and methods to collect data.
Experimental control and manipulation of variables.
Collection of empirical data.
Modeling and analysis of data.
Evaluation of results.
11. Statistics is widely used in quantitative research.
Quantitative methods can be divided into four types: descriptive, correlational, causal-comparative, and experimental research.
12. Jokes: [http://www.btinternet.com/~se16/hgb/statjoke.htm] How many statisticians does it take to change a light bulb? 1-3, alpha = 0.05.
There is no truth to the allegation that statisticians are mean. They are just your standard normal deviates.
Did you hear about the statistician who invented a device to measure the weight of trees? It’s referred to as the log scale.
Did you hear about the statistician who was thrown in jail? He now has zero degrees of freedom.
13. Sub-sections of Research Methodology Research/Experimental Design:
This depends on your research type.
Is it descriptive, correlational, causal-comparative or experimental research?
For example, one may want to compare two teaching methods. One possible design is to have three groups of subjects (method 1, method 2, and a control), each given a pre-test and a post-test.
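One common way to form the three groups in a design like this is random assignment. The sketch below is illustrative only: the subject identifiers, the group labels, and the helper name `assign_groups` are assumptions made for the example, and the talk does not prescribe this particular procedure.

```python
import random


def assign_groups(subjects, group_names, rng):
    """Randomly split subjects into groups of (nearly) equal size."""
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)          # random order removes selection bias
    assignment = {name: [] for name in group_names}
    for position, subject in enumerate(shuffled):
        # Deal subjects out to the groups in turn, like dealing cards.
        name = group_names[position % len(group_names)]
        assignment[name].append(subject)
    return assignment


# Hypothetical study: 30 students, two teaching methods and a control group.
rng = random.Random(2012)          # fixed seed so the split is reproducible
students = list(range(1, 31))
design = assign_groups(students, ["method 1", "method 2", "control"], rng)

for name, members in design.items():
    print(name, sorted(members))
```

Each group would then take the same pre-test and post-test, so that gains can be compared across groups.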
14. Sampling Methods: For surveys or any research in which you plan to collect data, define your population.
What is the sampling design? (or How will you select your subjects?)
How many subjects will you select?
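For example, to estimate a proportion:
$$n = \frac{Npq}{(N - 1)\,B^2/4 + pq},$$ where N = population size, B = error bound, q = 1 − p, and p = 0.5 (the most conservative choice when no prior estimate of p is available).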
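The sample-size formula for a proportion can be evaluated directly. The sketch below assumes q = 1 − p and rounds the result up to the next whole subject; the function name and the example population sizes are hypothetical.

```python
import math


def sample_size_for_proportion(N, B, p=0.5):
    """Sample size n = Npq / [(N - 1) B^2 / 4 + pq] for estimating a proportion.

    N: population size
    B: error bound (for example 0.05 for plus or minus 5 percentage points)
    p: anticipated proportion; 0.5 gives the largest (most cautious) n
    """
    q = 1.0 - p
    n = (N * p * q) / ((N - 1) * B ** 2 / 4.0 + p * q)
    return math.ceil(n)            # round up: a fraction of a subject is not possible


print(sample_size_for_proportion(N=10000, B=0.05))   # 385
print(sample_size_for_proportion(N=1000, B=0.05))    # 286
```

Note how the required sample grows much more slowly than the population: a tenfold larger population needs only about 100 more subjects at the same error bound.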
15. Sampling Methods continued: To generalize your result, use a probability sampling method.
Among the probability sampling methods are
simple random sample; systematic random sample; stratified random sample; cluster sample.
Among the non-probability sampling methods are voluntary response sample; convenience sample.
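Three of the probability sampling methods listed above can be sketched with Python's standard `random` module. This is a minimal illustration, not a full survey-sampling implementation: the sampling frame, the strata, and the function names are hypothetical, and the stratified version assumes proportional allocation.

```python
import random


def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely to be chosen."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th unit from the frame."""
    k = len(population) // n               # sampling interval
    start = rng.randrange(k)               # random start within the first interval
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


rng = random.Random(2012)                  # fixed seed for a reproducible example
frame = list(range(1, 101))                # hypothetical frame of 100 student IDs
strata = {"science": frame[:60], "arts": frame[60:]}

srs = simple_random_sample(frame, 10, rng)
systematic = systematic_sample(frame, 10, rng)
stratified = stratified_sample(strata, 0.10, rng)
print(sorted(srs), systematic, sorted(stratified), sep="\n")
```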
16. Measurement Instruments (and/or Materials): Are you using a survey designed by you or by someone else? If it was designed by someone else, give the reference.
Address the reliability of the instrument.
In the biological or medical sciences, address the materials that will be used.
17. Data Collection Methods: If you do not have adequate training in this area (or in data analysis), seek help before you begin to collect your data.
Quite often, researchers collect inadequate data or data that are not properly recorded.
You want to be sure that the data you collect can be used to answer your questions or test your hypotheses.
18. Data Collection Methods continued: Data from surveys and experiments are called primary data.
Data obtained from an existing source, rather than collected by the researcher, are called secondary data.
For example, the humanities, social sciences, public health, law and education are most likely to use surveys.
Agriculture, the physical and biological sciences, and the medical sciences are most likely to conduct experiments.
19. Some data collection methods are:
Self-administered questionnaires (mailed or handed out, especially in a convenience sample)
20. Example of Model Developments sub-section: R-squared measures will be developed.
The R-squared measures will be adjusted for both the sample size and the number of independent variables.
Power-divergent statistics will be developed.
A detailed simulation study will be conducted to compare the log-likelihood ratio, R-squared, and the power-divergent statistics.
21. Data Analysis Will data be analyzed qualitatively or quantitatively?
The choice will depend on data collection methods and the sample size.
Describe the types of data analysis or modeling that will be done.
Address each research question by describing the type of statistical tests that will be performed.
Include the name of the software used.
22. Data Cleaning: This is the process in which you detect and correct errors in the data.
Some errors come from typing mistakes during data entry or from coding errors.
To detect errors:
Obtain descriptive statistics like frequency counts, minimum, maximum, means, range, and standard deviation. Obtain graphs like histogram or scatter plot.
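The error checks described above can be sketched with the standard library alone. The data values, the plausible age range of 0 to 120, and the helper name `describe` are assumptions chosen for illustration; in practice the valid range comes from the study's codebook.

```python
import statistics
from collections import Counter


def describe(values):
    """Descriptive statistics that make entry errors easy to spot."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),    # sample standard deviation
    }


# Hypothetical survey data: 250 is probably a typing error for 25.
ages = [23, 25, 31, 28, 250, 27, 30]
summary = describe(ages)
print(summary)

# An impossible maximum points to the record that needs checking.
flagged = [age for age in ages if not 0 <= age <= 120]
print("values outside the plausible range:", flagged)

# Frequency counts expose inconsistent coding of a categorical variable.
gender = ["F", "M", "F", "f", "M", "F"]
counts = Counter(gender)
print(counts)                              # "f" appears as its own category
```

A histogram or scatter plot serves the same purpose visually: the value 250 would sit far from the rest of the ages.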
23. More Jokes: The only time a pie chart is appropriate is at a baker's convention.
Old statisticians never die, they just undergo a transformation.
How do you tell one bathroom full of statisticians from another? Check the p-value.
Did you hear about the statistician who made a career change and became a surgeon specializing in ob/gyn? His specialty was histerectograms.
24. Data Types Generally speaking, statistical techniques are often determined based on the type of data.
The two major types of variables are qualitative and quantitative variables.
Qualitative variables: The data values are non-numeric categories. The measurement scales are:
Nominal: data are non-numeric group labels.
Ordinal: values are ranked categories.
25. Quantitative variables: The data values are counts or numerical measurements. They can be discrete or continuous.
The measurement scales are:
Interval: data values range over a real interval. The difference, but not the ratio, of two values is meaningful. Interval data have no absolute zero.
Ratio: both the difference and the ratio of two values are meaningful.
26. Statistics (descriptive and inferential) Descriptive statistics (Numeric and Graphic):
These include summary statistics (mean, median, standard deviation, frequency) and graphic tools (pie charts, bar charts, histograms, box plots, scatter plots).
For nominal data: Use frequency, crosstabs, bar charts and pie charts.
For ordinal data: Use frequency, crosstabs, summary statistics, bar charts and pie charts.
For continuous data: Use summary statistics, histograms, box plots, and scatter plots.
27. Estimation and Tests (Inferential statistics): These are used to make comparisons between two or more groups or to study relationships.
These include point estimation, confidence interval or interval estimation, and hypothesis testing.
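As one illustration of point and interval estimation, the sketch below computes a large-sample (normal approximation) confidence interval for a proportion. The choice of this particular interval, the function name, and the survey counts are assumptions for the example; other intervals may be preferable for small samples.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, level=0.95):
    """Point estimate and large-sample confidence interval for a proportion."""
    p_hat = successes / n                          # point estimate
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Hypothetical survey: 120 of 400 respondents answered "yes".
p_hat, low, high = proportion_confidence_interval(120, 400)
print(f"estimate = {p_hat:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```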
28. If you are interested in comparing group effects: For nominal or ordinal data, use crosstabs (chi-square).
For continuous data (first, check for normality):
For a two-group comparison, use the independent t-test.
For a comparison of three or more groups, use one-way analysis of variance (ANOVA).
For two or more factors, use multi-way ANOVA.
If there are factors and covariates, use analysis of covariance (ANCOVA).
If the same subject is measured more than once, use a paired t-test for two time periods and a repeated-measures ANOVA for more than two periods.
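The two t-tests mentioned above differ in what they compare, which a small hand computation makes concrete. This sketch returns only the test statistic and degrees of freedom; it assumes equal variances for the independent case (the pooled form), and the scores are invented. Statistical software would also report the p-value.

```python
import math
import statistics


def pooled_t_statistic(x, y):
    """Independent two-sample t statistic, assuming equal variances."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / std_error
    return t, n1 + n2 - 2


def paired_t_statistic(before, after):
    """Paired t statistic: a one-sample test on within-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical scores for two different groups of students.
method_1 = [1, 2, 3, 4, 5]
method_2 = [3, 4, 5, 6, 7]
t_ind, df_ind = pooled_t_statistic(method_1, method_2)
print(f"independent: t = {t_ind:.2f}, df = {df_ind}")

# Hypothetical pre-test and post-test scores for the same four students.
pre = [10, 12, 9, 11]
post = [12, 15, 10, 14]
t_pair, df_pair = paired_t_statistic(pre, post)
print(f"paired: t = {t_pair:.2f}, df = {df_pair}")
```

The paired version uses each subject as its own control, which is why it works on the differences rather than on the two sets of scores separately.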
29. If you are interested in the relationship between two variables: For nominal data, use crosstabs and choose tests appropriate for nominal data.
For ordinal data, use crosstabs (chi-square test) or a bivariate correlation such as the Spearman correlation coefficient.
For continuous data, use a bivariate correlation such as the Pearson correlation.
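The difference between the Pearson and Spearman coefficients is easiest to see on data with a curved but strictly increasing relationship. The sketch below computes both by hand; the helper names and the data are hypothetical, and Spearman is computed as the Pearson correlation of the ranks (with average ranks for ties).

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                         # extend the run of tied values
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]                      # increasing, but not linear
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson = {r_pearson:.3f}, Spearman = {r_spearman:.3f}")
```

Here Spearman is 1 because the ordering is perfectly preserved, while Pearson is slightly below 1 because the relationship is not a straight line.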
30. If you are interested in modeling a response variable using predictor variables: For nominal data, use a logistic regression model if the response is a binary variable (that is, it has only two possible values, such as yes or no). If the response has more than two categories, use multinomial logistic regression.
For count data, use a Poisson regression model if the response follows a Poisson distribution. In general, one can use log-linear models for ordinal data.
For continuous data, use regression analysis.
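For the continuous-response case, the simplest instance is a straight-line fit with one predictor. The sketch below uses ordinary least squares computed by hand; the function names and the data are hypothetical, and the example data lie exactly on a line so the result is easy to verify.

```python
def fit_line(x, y):
    """Least-squares estimates for the model y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Proportion of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst


# Hypothetical data: hours of study (x) and test score (y).
hours = [1, 2, 3, 4]
score = [3, 5, 7, 9]
intercept, slope = fit_line(hours, score)
fit = r_squared(hours, score, intercept, slope)
print(intercept, slope, fit)               # 1.0 2.0 1.0
```

Real data will not fall exactly on a line, so R-squared will be below 1 and the residuals should be checked against the model assumptions.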
31. Assumptions: Most statistical techniques require certain assumptions.
Typically, for continuous response, the assumptions may include:
Normality of the response variable.
Homogeneity of variance.
The relationship between Y and X’s is linear.
When assumptions do not hold, use transformation or a non-parametric method.
32. Some Nonparametric Methods: Chi-square tests
For a comparison of two independent samples, use the Mann-Whitney U or Kolmogorov-Smirnov Z test. This is similar to the independent t-test.
For a comparison of K independent samples, use the Kruskal-Wallis H or Median test. This is similar to ANOVA.
For two related samples, use the Wilcoxon signed-rank or Sign test for quantitative data, the McNemar test for binary data, and the Marginal Homogeneity test for multinomial data. This is similar to the paired t-test.
For K related samples, use the Friedman test, Kendall's W measure of agreement, or Cochran's Q for binary data. This is similar to repeated-measures ANOVA.
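The chi-square test, the first method listed above and the one behind the crosstab comparisons, can be worked through by hand. The sketch below computes the test statistic for independence in a two-way table; the table of counts and the function name are hypothetical, and only the statistic and degrees of freedom are returned rather than a p-value.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical crosstab: teaching method (rows) by pass/fail (columns).
crosstab = [[30, 20],
            [20, 30]]
chi2, df = chi_square_statistic(crosstab)
print(chi2, df)                            # 4.0 1
```

With 1 degree of freedom the 5% critical value is about 3.84, so a statistic of 4.0 would indicate an association between method and outcome at that level.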
33. If you are interested in reducing the data dimension: Use Cluster Analysis or Factor Analysis.
Cluster analysis can be applied to group variables or cases. Cluster analysis for variables groups the variables into a small number of subsets based on the similarity of cases.
Factor analysis combines similar variables together into a dimension that can be interpreted from the qualitative aspects of the study.
34. Conclusions/Final Comments You bought an expensive clothing material.
Do you look for an apprentice tailor to sew the material for you?
You look for an experienced tailor who is very knowledgeable.
When you decide to seek help with your data collection and/or data analysis, you should not settle for just anybody.
Look for someone with adequate training in statistical methodology.
Mathematics Dept Statistical Consulting Unit
35. Conclusions/Final Comments continued: When a test shows a significant effect, a common misunderstanding is that the hypothesis has been proven.
In a statistical test, if the outcome is inconsistent with the research hypothesis, then the hypothesis is rejected.
If the outcome is consistent with the research hypothesis, the data are said to support the hypothesis.
A hypothesis is never proven; it is only supported by the analyzed data.
36. More Jokes: A statistician can have his head in an oven and his feet in ice, and he will say that on the average he feels fine.
Numbers are like people; torture them enough and they will tell you anything.
Statistics in the hands of an engineer are like a lamppost to a drunk: they are used more for support than illumination. (Bill Sangster, Dean of Engineering, Georgia Tech.)
The statistics on sanity are that one out of every four Nigerians is suffering from some form of mental illness. Think of your three best friends. If they are okay, then it is you. (Rita Mae Brown, for Americans)
37. Thanks for your attention
This is the end of the presentation