Corpus-Informed Teaching and Research II. Ken Lau. Going from Prescriptive to Descriptive.
Prescriptive grammar has its origins in 18th- and 19th-century Europe, where grammar was connected to the idea that there was a relationship between ‘orderliness’ in speech and writing. Prescriptive grammar is often concerned with what is ‘correct’ and ‘incorrect’ – with “standard language” and dictionaries that record correct spelling and set down precise meanings – all part of unifying emerging nation-states. We can see a similar tendency in China over the past 150 years, as Mandarin and Modern Standard Chinese acquired normative status as a national standard.
Many of the prescriptive rules of English that cause us most trouble today come from this period (e.g. Who did you speak to? or To whom did you speak?) and were based upon analogies with Latin rather than what English speakers actually said or wrote at the time.
Descriptive grammar has its origin in 20th-century linguistics and other disciplines that began to see language as a vital resource for studying all kinds of aspects of social and cultural life. Descriptive grammar grew out of a concern with the language forms people actually used. In part, this approach was an anthropological one, as different language groups were studied in relation to European languages; in part, it also arose as a wave of American missionaries set about translating the Bible into languages that had no written form. The techniques of inducing grammatical rules from spoken data were learnt, but so also was a respect for the variability of language as a system for reflecting thought and relationships.
Out of this work grew the idea of ‘acceptability’ (for communication to succeed) rather than simply a concern for formal ‘correctness’. From a descriptive point of view, Who did you speak to? and To whom did you speak? are both acceptable sentences because both are used and both make sense. But To speak whom did you? is not acceptable because it is neither used nor makes sense.
The term ‘data-driven learning’ suggests that it is an inductive approach and therefore comparable with the implicit method, though the emphasis is on gaining insight rather than establishing habits, and in this sense it is mentalistic.
The approach makes high demands on the students in terms of language proficiency, observation and inductive reasoning. It is therefore more suitable for advanced language learners.
Corpora do indeed provide authentic language use. One question people ask, however, is whether corpora really capture reality. However large a corpus is, it still cannot capture every instance of language use in an adult user’s experience.
Carter suggests two such reasons: ‘double-Dutch’, ‘go Dutch’ and indeed, ‘Dutch cap’, could all be useful expressions for a learner wishing to avoid social embarrassment in Britain; and the study of British insularity, as revealed through linguistic references to foreign nationals and nations, could constitute a stimulating activity which could increase learners’ awareness of cultural issues.
The inclusion in syllabuses of language which is very rare in large corpora thus calls for justification, and the same is equally true for the exclusion of language which is common. As we saw with ‘real’, corpora can remind us of frequent uses which might otherwise tend to be ignored. Thus McCarthy and Carter (1995) notice the frequency in speech of the semi-modal ‘tend to’ (it occurs almost as often as ‘ought’ in the BNC spoken component). Although this verb has traditionally received little attention in teaching, it arguably provides learners with a valid alternative to frequency adverbs such as ‘usually’ and ‘often’.
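A frequency comparison of this kind can be sketched in a few lines of Python. This is only an illustration: the sample text is invented and is not drawn from the BNC, and the function name per_million and the regular expressions are assumptions made for the example. Frequencies are normalised per million words so that counts from corpora of different sizes can be compared.

```python
import re

def per_million(pattern: str, text: str) -> float:
    """Return the number of regex matches per million word tokens."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z']+", lowered)
    hits = len(re.findall(pattern, lowered))
    return hits / len(tokens) * 1_000_000 if tokens else 0.0

# Invented toy sample for illustration only, NOT taken from the BNC
sample = (
    "I tend to get up early. You ought to try it. "
    "They tend to arrive late, and we tend to wait."
)

# \b keeps 'tend to' from matching inside 'intend to' or 'attend to'
tend_to = per_million(r"\btend(?:s|ed|ing)?\s+to\b", sample)
ought = per_million(r"\bought\b", sample)

print(f"tend to: {tend_to:.0f} per million words")
print(f"ought:   {ought:.0f} per million words")
```

On a real corpus the text would be read from file, and a lemmatised or part-of-speech-tagged version would give more reliable counts than a plain regular expression.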
A key endeavor in the production of corpus-based materials to aid students with academic writing of a general nature is that by Thurstun and Candlin (1998a, b). Moving from controlled to more open-ended writing activities would seem to inculcate in students a kind of ‘corpus competence’.
In this corpus-derived material the lexico-grammar is introduced according to its specific rhetorical function, e.g. referring to the literature, reporting the research of others. Within each broad function, each keyword (e.g. argue, suggest) is then examined within the following chain of activities:
LOOK at concordances for the key term and the words surrounding it, thinking of meaning (using, for instance, BAWE).
FAMILIARIZE yourself with the patterns of language surrounding the key term by referring to the concordances as you complete the tasks.
PRACTISE key terms without referring to the concordances.
CREATE your own piece of writing using the terms studied to fulfill a particular function of academic writing.
A variety of specialized corpora, consisting of lectures, engineering textbooks, legal essays and research articles, have been used for various types of pedagogic applications, which very often combine initial pen-and-paper awareness-raising activities with follow-up direct consultation of the corpus by students.
Jones and Schmitt (2010) devised discipline-specific vocabulary materials including both technical and colloquial terms, derived from corpora of academic seminars on language and gender, international law and entrepreneurship. Mudraya’s (2006) materials, based on a 2-million-word corpus of engineering textbooks, also targeted vocabulary, but of a sub-technical nature. Mudraya characterises this type of vocabulary as those items, such as current, solution and tension, which have some sense in general English but are used in a different sense in technical English.
She proposes a set of queries based around solution on the grounds that this word occurs both as a high-frequency word family in its general sense and as a frequent sub-technical item. Students are presented with concordance output of carefully selected examples of solution. In one exercise they are asked to identify the adjectives used with solution (1) in the general sense and (2) in the technical (chemical) sense, and then to underline those adjectives that can be used with both senses of solution, as a means of highlighting collocational sensitivities.
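The exercise above can be mimicked with a minimal Python sketch. Everything in it is an assumption made for illustration: the sentences are invented and hand-sorted by sense (they are not from Mudraya’s corpus), and left_collocates is a hypothetical helper that simply takes the word immediately to the left of the node. A real implementation would use part-of-speech tagging to keep only adjectives.

```python
import re
from collections import Counter

def left_collocates(node: str, sentences: list[str]) -> Counter:
    """Count the word immediately to the left of each occurrence of node."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i, tok in enumerate(tokens):
            if tok == node and i > 0:
                counts[tokens[i - 1]] += 1
    return counts

# Invented sentences, hand-labelled by sense (hypothetical data)
general = [
    "There is no simple solution to this problem.",
    "An ideal solution would satisfy both sides.",
    "They proposed a possible solution.",
]
technical = [
    "The salt was dissolved in an aqueous solution.",
    "A dilute solution of acid was added.",
    "An ideal solution obeys Raoult's law.",
]

general_adj = left_collocates("solution", general)
technical_adj = left_collocates("solution", technical)

# Words found with both senses: the ones students would underline
shared = set(general_adj) & set(technical_adj)
print("general:  ", sorted(general_adj))
print("technical:", sorted(technical_adj))
print("both:     ", sorted(shared))
```

Because the toy sentences always place an adjective directly before solution, no filtering is needed here; on authentic data, determiners such as the would dominate the left-hand slot unless they are filtered out.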
At HKU, a legal concordancer was created by the Centre for Applied English Studies (http://www4.caes.hku.hk/lawvocab/tools/index.htm) to help Law students improve their legal writing skills. One task in the course Writing Solutions to Legal Problem requires each student to present the usage of a legal term of his/her choice by analyzing concordance lines.
Several pedagogic applications approach the corpus consultation from a genre-based perspective. Bhatia et al. propose various move-specific concordancing activities for one genre of legal English, the problem-question genre written by students within academic settings. They note that deductive reasoning plays a major role in this highly specialized genre. One of the major foci, therefore, is to have students examine various types of non-lexical epistemic and pragmatic/discoursal hedges for the role they play in the deductive reasoning.
Another advocate of a concordance- and genre-based approach to academic essay writing in the legal field, specifically formal legal essays written by undergraduates, is Weber (2001). First, Weber’s students were inducted into the genre of legal essays by reading through whole essays taken from the University of London LLB Examinations written by native speakers, and identifying some of the prototypical rhetorical features, e.g. identifying and/or delimiting the legal principle involved in the case. They were then asked to identify any lexical expressions which seemed to correlate with the genre features. This was followed up by consulting the corpus of the legal essays to verify and pinpoint regularities in lexico-grammatical expressions. As in the tasks proposed by Bhatia et al., Weber also approaches the lexico-grammar from the perspective of a ‘local grammar’, which ‘attempts to describe the resources for only one set of meanings in a language rather than for the language as a whole’ (Hunston 2002: 90).
So far we have only looked at expert corpora. We should bear in mind that corpora containing texts from learners have high pedagogical and research value. Mukherjee and Rohrbach (2006) advocated individualizing writing by having students build mini-corpora of their own writing, and localising the database. A pedagogic initiative in which students compare a learner corpus of NNS MBA dissertation writing with a corpus of published journal articles from the field of Business Studies, both compiled by the teacher, is that by Hewings and Hewings (2002).
In spite of the potential advantages of integrating learner corpus data into pedagogy, Nesselhauf (2003) points out that care is needed in presenting learner corpus data to students, as does Mukherjee (2009: 213): ‘It is neither desirable or useful to establish a rigid dichotomy between good and correct usage in native data on the one hand and bad and incorrect usage in learner output on the other’.
In this short course, we have not touched upon bilingual corpora, but their value to translation and to language teaching and learning should not be underestimated. Both Teubert (2004) and Barlow (2000) emphasise that parallel corpora are especially useful for examining phraseological queries, with Barlow noting that frequency counts provide ‘a very good indication of the preferred structure in each language’. Frankenberg-Garcia (2005) shows the value of using concordancing output from a parallel corpus in preference to a bilingual dictionary, as students can see the different contexts in which a word is used.
Key notions to be covered here would include the different types of corpora available (spoken, written, multimodal, etc.), corpus design, size and representativeness. Teachers need to know how to choose among different types of corpora for particular queries. Teachers would also be introduced to concordancing, a key analytical tool for corpus queries. Teachers need to know how to formulate different kinds of queries through specifying searches to the left and right of the node word and how to sort the concordance lines alphabetically.
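To make the idea of searching and sorting around a node word concrete, here is a minimal keyword-in-context (KWIC) sketch in Python. It is an illustration under stated assumptions rather than a description of any particular concordancer: the function names kwic and sort_lines, the span of three words and the sample text are all invented for the example.

```python
import re

def kwic(node, text, width=3):
    """Return (left, node, right) token lists for each hit of the node word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines

def sort_lines(lines, side="right"):
    """Sort alphabetically, starting from the word nearest the node."""
    def key(line):
        left, _, right = line
        # Reverse the left context so the word next to the node comes first
        return left[::-1] if side == "left" else right
    return sorted(lines, key=key)

# Invented sample text for illustration only
text = (
    "Smith argues that the data are weak. Others argue against this view. "
    "We argue that both positions hold. Few argue for a third option."
)

lines = kwic("argue", text)
for left, node, right in sort_lines(lines, side="right"):
    print(f"{' '.join(left):>25}  [{node}]  {' '.join(right)}")
```

Sorting to the right groups the patterns that follow the node (argue against, argue for, argue that), while sorting to the left groups what precedes it. The search here matches the exact word form only, so argues is not retrieved; a lemma-based search would be needed to include it.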
Many other researchers point out that teachers’ IT competence, or lack thereof, and preference for more traditional resources are not to be taken lightly and that technological awareness is a key component of developing teachers’ corpus competence.
A necessary prerequisite for expert teaching is pedagogical content knowledge, consisting of content knowledge (i.e. linguistic knowledge in the case of EFL teaching), pedagogic knowledge, and content-specific teaching knowledge.
Pedagogic and content-specific teaching knowledge have both been addressed in corpus-based modules on teacher education programmes.
O’Keeffe and Farr (2003) outline a series of tasks for raising students’ awareness of pedagogic knowledge through analysis of corpus classroom data. It is of interest to note that they combine this aspect with raising teachers’ technological awareness and also content knowledge of discourse analysis by building hints on searching into the instructions, and by asking teachers to analyse the concordance output based on the classroom discourse model. They also point out that the corpus data chosen was from both expert and non-expert teachers to avoid equating inexperience with lack of expertise or vice versa.
Teaching with corpora to raise teachers’ linguistic awareness was first introduced in teacher education programmes in the mid-1990s, together with training in using corpora. These studies emphasise the benefits of corpus-based enquiries that focus on phraseological patterns or semantic information which may not be found in grammar books and dictionaries.