
SEL3053: Analyzing Geordie Lecture 15. Hypothesis formulation


albin

Presentation Transcript


  1. SEL3053: Analyzing Geordie, Lecture 15. Hypothesis formulation. Lecture 14 applied three different varieties of hierarchical cluster analysis to the DECTE data matrix M. In this lecture we will (1) formulate a hypothesis in answer to the research question, and (2) explain why the approach to sociolinguistic hypothesis formulation proposed in this module is intrinsically superior to traditional methods in Arts and Humanities research generally and in corpus-based linguistics more specifically.

  2. 1. Hypothesis formulation: 1.1 Phonetic usage analysis. The results from Lecture 14 are repeated in Figure 1 for convenience of reference.

  3. All three trees agree on the following: (tlsn01, tlsn02) form a cluster that is strongly distinguished from the cluster of all the other speakers; (tlsg01, tlsg40, tlsg03), (tlsg05, tlsg09, tlsg01, tlsg10) and (tlsg02, tlsg13, tlsg24) form moderately distinctive subclusters.

  4. The trees disagree on where to place the following speakers: tlsg08, tlsg10, and tlsg13.

  5. It can therefore be concluded that all three analyses broadly agree on the structure of differences in phonetic usage among the twelve speakers, but that they differ in some details which need to be investigated.
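The comparison of trees described above can be sketched in code. This is only an illustration, not the procedure used in the module. It assumes that each tree has already been cut into a flat partition (a list of clusters), and the speaker labels `s1` to `s6`, the three partitions, and the function names are hypothetical toy values rather than DECTE results. The sketch lists the speaker pairs that every partition groups together and the pairs that only some partitions group together.

```python
from itertools import combinations


def co_clustered_pairs(partition):
    """Return the set of unordered speaker pairs that share a cluster."""
    pairs = set()
    for cluster in partition:
        # Sorting gives each pair a canonical (a, b) order.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def agreement(partitions):
    """Split pairs into those grouped by every partition and by only some."""
    pair_sets = [co_clustered_pairs(p) for p in partitions]
    always = set.intersection(*pair_sets)
    sometimes = set.union(*pair_sets) - always
    return always, sometimes


# Hypothetical flat partitions, one per tree (toy data, not the lecture's).
tree_a = [{"s1", "s2"}, {"s3", "s4", "s5"}, {"s6"}]
tree_b = [{"s1", "s2"}, {"s3", "s4"}, {"s5", "s6"}]
tree_c = [{"s1", "s2"}, {"s3", "s4", "s6"}, {"s5"}]

always, sometimes = agreement([tree_a, tree_b, tree_c])
print("Grouped by all trees:", sorted(always))
# Grouped by all trees: [('s1', 's2'), ('s3', 's4')]
print("Grouped by some trees only:", sorted(sometimes))
```

Pairs in the second list point to the speakers whose placement differs between analyses and therefore needs further investigation.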

  6. 1.2 Correlation with social data. The social data relating to the DECTE / TLS speakers is found in the <profileDesc> section of the header in each interview, as described in an earlier lecture. The social data for our twelve speakers is shown opposite.

  7. The most obvious correlation is between the place of residence of the speakers and the cluster analyses: tlsn01 and tlsn02 are from Newcastle and all the others are from Gateshead. In the sample selected, therefore, Newcastle and Gateshead speakers are very strongly distinguished in terms of their phonetic usage. No social data apart from place of residence survives for the Newcastle speakers, so the search for further correlations between phonetic usage and the social data focuses on the Gateshead speakers.

  8. One of the trees we looked at earlier is shown below, with social information added.

  9. Some observations: Gender has the most obvious correlation with cluster structure: all the trees agree in clustering the male speakers (tlsg02, tlsg24, tlsg13) against the remaining seven female speakers.

  10. Age shows no obvious correlation among the males (unsurprising, given how few of them there are), but there is a correlation in all the trees for the females: the older ones (tlsg01, tlsg40, tlsg03) cluster against the younger ones (tlsg10, tlsg08, tlsg05, tlsg09).

  11. Education shows a moderate correlation: most of the speakers have minimal education, but the two females with day-release level (tlsg05, tlsg09) cluster in all three trees.
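The correlation between cluster structure and a social variable can be checked with a simple cross-tabulation. The sketch below is an illustration under stated assumptions, not the module's own procedure. The gender values and the two-way split of the ten Gateshead speakers are taken from the observations above, while the cluster labels `A` and `B`, the function names, and the use of purity as a summary measure are assumptions made for this example.

```python
from collections import Counter

# Gender of the ten Gateshead speakers, as stated in the observations above.
gender = {
    "tlsg02": "M", "tlsg13": "M", "tlsg24": "M",
    "tlsg01": "F", "tlsg03": "F", "tlsg40": "F",
    "tlsg05": "F", "tlsg08": "F", "tlsg09": "F", "tlsg10": "F",
}

# Two-way split on which all trees agree; the labels A and B are arbitrary.
group_a = {"tlsg02", "tlsg13", "tlsg24"}
cluster = {s: ("A" if s in group_a else "B") for s in gender}


def crosstab(cluster_of, value_of):
    """Count speakers for each (cluster, social value) combination."""
    return Counter((cluster_of[s], value_of[s]) for s in cluster_of)


def purity(table):
    """Fraction of speakers holding the majority value of their cluster."""
    largest = {}
    for (c, _value), n in table.items():
        largest[c] = max(largest.get(c, 0), n)
    return sum(largest.values()) / sum(table.values())


table = crosstab(cluster, gender)
for (c, g), n in sorted(table.items()):
    print(f"cluster {c}, gender {g}: {n}")
print("purity:", purity(table))  # 1.0: each cluster holds a single gender
```

A purity of 1.0 means that the split coincides exactly with gender. With only ten speakers this is suggestive rather than statistically conclusive. The same cross-tabulation could be applied to age group or education level if those values are entered from the social data.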

  12. 1.3 Hypothesis. We are now in a position to answer the research question: Is there systematic phonetic variation in the Tyneside speech community as represented by DECTE, and, if so, does that variation correlate systematically with social variables? The hypothesis is: There is systematic phonetic variation in the Tyneside speech community as represented by DECTE, and that variation correlates with social variables. Newcastle speakers differ strongly from Gateshead ones in their phonetic usage. Among Gateshead speakers, the main correlation is between gender and phonetic usage. Among female Gateshead speakers, the main correlation is with age, though there is also a moderate correlation with educational level.

  13. 2. Why the approach to hypothesis generation proposed in this module is intrinsically superior to traditional methods. In Arts and Humanities disciplines, hypothesis generation and testing have traditionally been based on the individual researcher's familiarity with the domain of interest. For example: in historical studies, the historian spent many years getting to know the documentation relevant to his period and area of interest; in literary studies, the scholar did the same for literary materials; in linguistics, the linguist did the same for historical and contemporary samples of linguistic usage.

  14. There is a fundamental problem with this traditional approach: it lacks the key scientific attributes of objectivity and replicability.

  15. Objectivity: It is a commonplace of philosophy in general and of philosophy of science in particular that there can be no truly objective observation of the world, and the present module is fully committed to that position. There are, however, degrees of objectivity, and science tries to maximize objectivity by using methods which are general, that is, applicable across a range of domains and therefore not shaped by any individual researcher's presuppositions about his or her data. Replicability: Research results in science must be amenable to peer review, which means that scientists other than those who produced the results must be in a position to repeat the relevant experiments and get the same results. The methodology used in science allows for this.

  16. The methodology used in this module has both attributes: the proposed data creation, representation, transformation, and clustering methods are objective in the above sense, and they allow the analysis presented in the foregoing lectures to be replicated. Traditional methods, on the other hand, lack these attributes: they are essentially subjective and non-replicable. The conclusion is that the approach to hypothesis generation presented in this module is the way forward for corpus-based linguistics.
