
ONE DAY WORKSHOP ON BASIC RESEARCH METHODOLOGY, 17th December 2016


Presentation Transcript


  1. ONE DAY WORKSHOP ON BASIC RESEARCH METHODOLOGY, 17th December 2016. Prof. Prakash Kumar, Retd. Jr. Scientist-cum-Asst. Prof., Dept. of Agril. Statistics, Birsa Agricultural University, Ranchi

  2. Objective of the Workshop Participants are expected to: • Understand the basic notion of statistics in research • Learn various statistical techniques used to analyze data • Be able to interpret results and draw conclusions • Learn the tools used in the analysis of data

  3. Research Design and Methodology • Research is the process of investigating scientific questions • Steps in the research process: • Defining the problem and conceptualizing the study • Designing and conducting the study • Collecting data • Analyzing data • Making sense of the data

  4. Statistics Statistics is the science of data collection, summarization, analysis and interpretation.

  5. DATA COLLECTION Statistical data can be primary or secondary. Primary data are those collected for the first time, either by conducting a survey or an experiment, and are thus original in character. Secondary data are those that have already been collected by some other person and have passed through the statistical process at least once.

  6. SUMMARISATION OF DATA When data are collected by a survey or an experiment, those primary data are raw material to which statistical methods must be applied for analysis and interpretation. The data contained in schedules and questionnaires are not directly fit for analysis and interpretation, so it is essential that they be made available in condensed form for further treatment. For this purpose the data are divided and classified into homogeneous groups and then tabulated as needed. Types of classification: 1. Geographical 2. Chronological 3. Qualitative 4. Quantitative. Types of tabulation: 1. Simple and complex tabulation (a. one-way table, b. two-way table, c. three-way table, d. higher-order table) 2. General and special-purpose tables

  7. A Taxonomy of Statistics

  8. How does statistics help us? Ages (in months) of the 60 patients in our data set 1 are: 71, 127, 65, 82, 140, 53, 114, 56, 84, 65, 67, 134, 64, …, 91, 51. By simply looking at the data we fail to produce any informative account of them; however, statistics produces quick insight into the data using graphical and numerical statistical tools.
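The numerical side of that quick insight can be sketched in a few lines of Python, using only the ages actually listed on the slide (the full set of 60 values is elided in the source, so this is an illustrative subset, not the workshop's data set):

```python
import statistics

# The ages (in months) shown on the slide; the remaining values of the
# 60-patient data set are elided in the original, so this is a subset only.
ages = [71, 127, 65, 82, 140, 53, 114, 56, 84, 65, 67, 134, 64, 91, 51]

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
sd_age = statistics.stdev(ages)  # sample standard deviation

print(f"mean = {mean_age:.1f}, median = {median_age}, s.d. = {sd_age:.1f}")
```

A glance at the raw list tells us little; these three numbers already describe the center and spread of the subset.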

  9. Statistical Description of Data • Statistics describes a numeric set of data by its • Center (mean, median, mode etc) • Variability (standard deviation, range etc) • Shape (skewness, kurtosis etc) • Statistics describes a categorical set of data by • Frequency, percentage or proportion of each category

  10. Statistical Inference • Sample → Population • Statistical inference is the process by which we acquire information about populations from samples.

  11. Parameter vs Statistics • Parameter: • Any statistical characteristic of a population. • Population mean, population median, population standard deviation are examples of parameters. • Parameter describes the distribution of a population • Parameters are fixed and usually unknown

  12. Parameter vs Statistics • Statistic: Any statistical characteristic of a sample. • Sample mean, sample median, sample standard deviation are some examples of statistics. • A statistic describes the distribution of a sample • The value of a statistic is known and varies from sample to sample • Statistics are used for making inferences on parameters

  13. Parameter vs Statistics • Statistical issue: to describe the distribution of a population through a census, or to make inferences on the population distribution/parameter using the sample distribution/statistic. • E.g. the sample mean is an estimate of the population mean
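The parameter-vs-statistic distinction can be illustrated with a small simulation; the population below is hypothetical, not taken from the workshop data:

```python
import random
import statistics

random.seed(1)  # reproducible sketch

# Hypothetical "population" of 10,000 values (mean ~100, s.d. ~15).
population = [random.gauss(100, 15) for _ in range(10_000)]
mu = statistics.mean(population)    # parameter: fixed, normally unknown

sample = random.sample(population, k=100)
x_bar = statistics.mean(sample)     # statistic: known, varies by sample

print(f"population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
```

Each new sample gives a different `x_bar`, but all of them cluster around the fixed parameter `mu`, which is why the sample mean is a usable estimate of the population mean.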

  14. Variable- any characteristic of an individual or entity. A variable can take different values for different individuals. • Distribution - (of a variable) tells us what values the variable takes and how often it takes these values. • Unimodal - having a single peak • Bimodal - having two distinct peaks • Symmetric - left and right half are mirror images.

  15. FREQUENCY DISTRIBUTION
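The frequency table on this slide is an image in the original; as a minimal sketch, a frequency distribution with equal class intervals can be built from raw data (the scores below are hypothetical) like this:

```python
from collections import Counter

# Hypothetical raw observations grouped into width-10 class intervals.
scores = [23, 45, 47, 51, 55, 58, 62, 64, 65, 68, 71, 74, 78, 82, 91]

width = 10
freq = Counter((s // width) * width for s in scores)  # key = lower class limit

for lower in sorted(freq):
    print(f"{lower}-{lower + width - 1}: {freq[lower]}")
```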

  16. DIAGRAMMATIC PRESENTATION OF DATA Graphical presentation: We look for the overall pattern and for striking deviations from that pattern. The overall pattern is usually described by the shape, center, and spread of the data. An individual value that falls outside the overall pattern is called an outlier. Bar diagrams, pie charts, histograms, stem-and-leaf plots, box-plots, etc. are used to display variables.

  17. Descriptive Statistics After collecting data we go for its classification, tabulation and also for its graphical or diagrammatic presentation. These measures help in reducing the size of the data and enable us to make comparative studies of related variables. But these derivatives are not enough; we further wish to reduce the complexity of the data and to make them comparable. For this we apply various statistical tools such as measures of central tendency and measures of dispersion.

  18. MEASURES OF CENTRAL TENDENCY

  19. MEAN, MEDIAN AND MODE

  20. MEASURES OF DISPERSION The mean, median and mode are measures of the central tendency of a variable, but they do not provide any information on how much the measurements vary or are spread. Measures of dispersion or variation are another statistical tool, used to assess the homogeneity or heterogeneity of a data set. There are different measures of dispersion such as Range, Mean Deviation, Standard Deviation, Variance and Standard Error. Another measure, the Coefficient of Variation (expressed in %), is a relative measure of dispersion.

  21. Range: the difference between the largest and smallest observations, Range = Maximum − Minimum. Mean Deviation: the mean of the absolute deviations from the arithmetic mean, M.D. = Σ|xi − x̄| / n

  22. Standard Deviation: the positive square root of the variance, S.D. = √[ Σ(xi − x̄)² / (n − 1) ] for a sample of n observations.

  23. Variance: Variance = (S.D.)² Standard Error: S.E. = S.D. / √n Coefficient of Variation (%): C.V.(%) = S.D. × 100 / Grand Mean
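Putting slides 21-23 together, the dispersion measures can be computed as follows; the data are hypothetical plot yields, and the sample (n − 1 divisor) versions of the standard deviation and variance are assumed:

```python
import math
import statistics

data = [12.0, 15.0, 11.0, 14.0, 18.0, 13.0]  # hypothetical plot yields

rng = max(data) - min(data)                              # Range
mean = statistics.mean(data)
mean_dev = sum(abs(x - mean) for x in data) / len(data)  # Mean Deviation
sd = statistics.stdev(data)                              # Standard Deviation
var = statistics.variance(data)                          # Variance = (S.D.)**2
se = sd / math.sqrt(len(data))                           # Standard Error
cv = sd * 100 / mean                                     # Coefficient of Variation (%)

print(f"range={rng}, M.D.={mean_dev:.2f}, S.D.={sd:.2f}, "
      f"var={var:.2f}, S.E.={se:.2f}, C.V.={cv:.1f}%")
```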

  24. Hypothesis Testing/ Test of Significance :

  25. Hypothesis: A hypothesis is a statement. It can be of two types: statistical and non-statistical. In a statistical hypothesis some statistical parameter, such as a mean or variance, is involved, whereas in a non-statistical hypothesis no statistical parameter is involved. The null hypothesis is represented by H0 and the alternative hypothesis by H1. An example of a statistical hypothesis is H0: µ1 = µ2 vs H1: µ1 ≠ µ2. An example of a non-statistical hypothesis is H0: The defendant is innocent vs H1: The defendant is guilty.

  26. Null Hypothesis vs. Alternative Hypothesis Null Hypothesis: a statement about the value of a population parameter. Represented by H0. Always stated as an equality. Alternative Hypothesis: a statement about the value of a population parameter that must be true if the null hypothesis is false. Represented by H1. Stated in one of three forms: >, <, or ≠.

  27. Steps in Hypothesis Testing 1. Specify the null (H0) and alternative (H1) hypotheses 2. Select the significance level (alpha), e.g. 0.05 or 0.01 3. Calculate the test statistic, e.g. t, F, chi-square 4. Calculate the probability value (p-value) or confidence interval 5. Describe the result and the statistic in an understandable way.
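The five steps can be sketched for an unpaired (two-sample, pooled-variance) t-test on hypothetical yield data; the critical value 2.306 is the two-tailed t-table entry assumed for alpha = 0.05 and 8 degrees of freedom:

```python
import math
import statistics

# Step 1: H0: mu1 == mu2  vs  H1: mu1 != mu2 (hypothetical variety yields)
a = [14.2, 15.1, 13.8, 16.0, 14.9]   # variety A
b = [12.1, 13.0, 12.8, 11.9, 12.5]   # variety B

# Step 2: significance level
alpha = 0.05

# Step 3: test statistic (pooled-variance t)
na, nb = len(a), len(b)
sp2 = ((na - 1) * statistics.variance(a)
       + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1/na + 1/nb))

# Step 4: compare with the two-tailed critical value t(0.025, df=8) = 2.306
t_crit = 2.306

# Step 5: state the result
decision = "reject H0" if abs(t) > t_crit else "fail to reject H0"
print(f"t = {t:.2f}, {decision}")
```

Comparing |t| with the tabled critical value plays the same role as computing a p-value and comparing it with alpha.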

  28. Type I and Type II Error

  29. Paired and Unpaired Tests:

  30. Selection of appropriate Tests :

  31. t-test

  32. Paired t-test

  33. Z-test

  34. VARIANCE RATIO TEST, i.e. F-TEST

  35. CHI SQUARE TEST The chi-square distribution is a theoretical or mathematical distribution which has wide applicability in statistical work. The term 'chi-square' (pronounced with a hard 'ch') is used because the Greek letter χ is used to define this distribution. It will be seen that the elements on which this distribution is based are squared, so the symbol χ² is used to denote the distribution. For both the goodness-of-fit test and the test of independence, the chi-square statistic is the same, and all the categories into which the data have been divided are used. The data obtained from the sample are referred to as the observed numbers of cases: the frequencies of occurrence for each category into which the data have been grouped. In the chi-square tests, the null hypothesis makes a statement about how many cases are to be expected in each category if the hypothesis is correct. The chi-square test is based on the difference between the observed and the expected values for each category. The chi-square statistic is defined as χ² = Σ (Oi − Ei)² / Ei, where Oi is the observed number of cases in category i and Ei is the expected number of cases in category i.
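The statistic itself is a one-line computation; the observed and expected counts below are hypothetical:

```python
# Chi-square goodness-of-fit statistic: chi2 = sum((O_i - E_i)**2 / E_i)
observed = [50, 30, 20]   # hypothetical sample counts per category
expected = [40, 40, 20]   # counts expected under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi2:.2f}")  # compared with the table value at df = k - 1
```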

  36. CORRELATION

  37. REGRESSION

  38. REGRESSION
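The correlation and regression slides above are images in the original; as a sketch, both the Pearson correlation coefficient and the simple linear regression of y on x can be computed from the defining sums (the dose-yield data here are hypothetical):

```python
import math

# Hypothetical paired data: fertilizer dose (x) and yield (y).
x = [10, 20, 30, 40, 50]
y = [25, 30, 34, 41, 45]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
syy = sum((yi - ybar) ** 2 for yi in y)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
b = sxy / sxx                    # regression slope of y on x
a = ybar - b * xbar              # regression intercept
print(f"r = {r:.3f}; fitted line: y = {a:.1f} + {b:.2f} x")
```

Note that r measures the strength of the linear association (here close to 1), while the regression line predicts y from x.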

  39. Introduction to Experimental design Experimental design is the study of strategies for efficient plans for the collection of data, leading to proper estimates of the parameters relevant to the researcher's objective. A properly designed experiment, for a particular research objective, is the basis of all successful experiments. There are three basic principles of experimental design. These are 1. Replication 2. Randomization 3. Control

  40. Terminology of experimental design • An experiment is a planned inquiry to obtain new knowledge or to confirm or deny previous results. • 2. The population of inference is the set of ALL entities to which the researcher intends to have the results of the experiment be applicable. • 3. The experimental unit is the smallest entity to which the treatment is applied and which is capable of being assigned a different treatment independently of other experimental units if the randomization was repeated. This is the most seriously misunderstood and incorrectly applied aspect of experimental design. • 4. A factor is a procedure or condition whose effect is to be measured.

  41. 5. A level of a factor is a specific manifestation of the factor to be included in the experiment. 6. Experimental error is a measure of the variation which exists among experimental units treated alike. 7. A factor level is said to be replicated if it occurs in more than one experimental unit in the experiment. 8. Randomization in the application of factor levels to experimental units occurs only if each experimental unit has an equal and independent chance of receiving any factor level, and if each experimental unit is subsequently handled independently.

  42. Replication Although randomization helps to ensure that treatment groups are as similar as possible, the results of a single experiment, applied to a small number of objects or subjects, should not be accepted without question. To improve the significance of an experimental result, replication, the repetition of each treatment on multiple experimental units, is required. Replication reduces the variability of experimental results, increasing their significance and the confidence level with which a researcher can draw conclusions about an experimental factor.

  43. Randomization Randomization involves randomly allocating the experimental units across the treatment groups. Thus, if the experiment compares new maize varieties against a check variety, which is used as a control, then new varieties are allocated to the plots, or experimental units, randomly. In the design of experiments, varieties are allocated to the experimental units not experimental units to the varieties. In experimental design we consider the treatments, or varieties, as data values or levels of a factor (or combination of factors) that are in our controlled study. Randomization is the process by which experimental units (the basic objects upon which the study or experiment is carried out) are allocated to treatments; that is, by a random process and not by any subjective and hence possibly biased approach. The treatments or varieties should be allocated to units in such a way that each treatment is equally likely to be applied to each unit. Because it is generally extremely difficult for experimenters to eliminate bias using only their expert judgment, the use of randomization in experiments is common practice. In a randomized experimental design, objects or individuals are randomly assigned (by chance) to an experimental group. Use of randomization is the most reliable method of creating homogeneous treatment groups, without involving any potential biases or judgments.

  44. Control of local variability – blocking In the design of experiments, blocking is the arrangement of experimental units into groups (blocks) that are similar to one another. For example, if an experiment is designed to test 36 new maize varieties, the field in which the varieties are to be tested may not be uniform, having two fertility gradients – low and high soil fertility. The level of fertility may introduce a level of variability within the experiment; therefore, blocking is required to remove the effect of low and high fertility. This reduces sources of variability and thus leads to greater precision. For randomized block designs, one factor or variable is of primary interest. However, there are also several other nuisance factors. Blocking is used to reduce or eliminate the contribution of nuisance factors to experimental error. The basic concept is to create homogeneous blocks in which the nuisance factors are held constant while the factor of interest is allowed to vary. Within blocks, it is possible to assess the effect of different levels of the factor of interest without having to worry about variations due to changes of the block factors, which are accounted for in the analysis.
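Blocking as described above can be sketched as a randomized-block layout: every treatment appears once per block, and the order is randomized independently within each block. The variety names and block count here are hypothetical:

```python
import random

random.seed(42)  # reproducible sketch

treatments = ["V1", "V2", "V3", "V4"]   # hypothetical maize varieties
blocks = 3                              # e.g. strips along the fertility gradient

layout = []
for _ in range(blocks):
    order = treatments[:]
    random.shuffle(order)   # fresh, independent randomization per block
    layout.append(order)

for i, block in enumerate(layout, start=1):
    print(f"Block {i}: {block}")
```

Because each block contains every treatment, differences between blocks (e.g. fertility) cancel out of the treatment comparisons.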

  45. The Completely Randomized Design (CRD) • The CRD is the simplest of all designs. It is equivalent to a t-test when only two treatments are examined. • Field marks: • Replications of treatments are assigned completely at random to independent experimental subjects. • Adjacent subjects could potentially have the same treatment. • Example field layout (4 treatments A-D, 4 replications each): A1 B1 C1 A2 / D1 A3 D2 C2 / B2 D3 C3 B3 / C4 A4 B4 D4
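The CRD assignment on the slide can be reproduced in miniature: 4 treatments with 4 replications each, shuffled completely at random over 16 plots. The particular arrangement will differ from the slide's, since the allocation is random:

```python
import random

random.seed(7)  # reproducible sketch

# 4 replicates of each of 4 treatments, as in the slide's A1..D4 layout.
plots = [t for t in "ABCD" for _ in range(4)]
random.shuffle(plots)   # complete randomization: no blocking restriction

# arrange the 16 plots as a 4 x 4 field grid
field = [plots[i:i + 4] for i in range(0, 16, 4)]
for row in field:
    print(" ".join(row))
```

Note that, unlike the blocked layout, nothing prevents adjacent plots from receiving the same treatment, which is exactly the "field mark" the slide describes.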
