LIN 3098 Corpus Linguistics

LIN 3098 Corpus Linguistics. Albert Gatt. In this lecture. We proceed with our discussion of how corpus-based studies influence the study of grammar. Focus: lexico-grammar. Uses of corpora in grammar studies. The use of corpora to study grammar is relatively recent.

Presentation Transcript


  1. LIN 3098 Corpus Linguistics Albert Gatt

  2. In this lecture • We proceed with our discussion of how corpus-based studies influence the study of grammar. • Focus: lexico-grammar

  3. Uses of corpora in grammar studies • The use of corpora to study grammar is relatively recent. • With corpora, the unit of analysis tends to be the word (tokens/types) • Studies of lexis therefore a natural application. • The study of grammar has in fact emphasised the role of lexis. • Also aided by recent developments in automatic POS tagging and parsing. • Additional grammatical information enables search and analysis of complex structures.

  4. Part 1 The relationship between grammar and lexis

  5. Degrees of abstraction • We have already looked at the use of corpora in studying collocations. • Given sufficient grammatical annotation, we can look at collocational patterns at different degrees of abstraction.

  6. Degrees of abstraction • Example: all preceding collocates of the noun time in the BNC. • Not all collocates are equally interesting. • Searching for a single word produces a lot of noise.

  7. Practical task 1 • Let’s try to make our search more interesting, by focusing on a combination of lexical and grammatical material. • Conduct a search for: • Any adjective followed by the noun time

  8. Degrees of abstraction • Example: only adjectival collocates of the noun time in the BNC. • Can make grammatically informed queries, e.g. [ADJ + time] • Allows focus on what is truly of interest.

  9. Practical task 2 • We can go further in abstracting away from specific lexical material. • Conduct a search for: • Any adjective followed by any noun

  10. Degrees of abstraction • Suppose we were interested in all adjective-noun combinations, i.e. [ADJ + N]. • Given a query language of the right complexity (such as CQL), we can extract grammatically interesting collocations.
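
The two queries above, [ADJ + time] and [ADJ + N], can be sketched in Python roughly as follows. This is only an illustration: it assumes the corpus is already available as (word, tag) pairs, and the toy data, the simplified tags `ADJ` and `N`, and the function name `adj_noun_pairs` are all hypothetical, not BNC data or the BNC tagset.

```python
from collections import Counter

# Toy POS-tagged corpus (hypothetical data, simplified tagset).
TAGGED = [
    ("a", "DET"), ("long", "ADJ"), ("time", "N"), ("ago", "ADV"),
    ("the", "DET"), ("first", "ADJ"), ("time", "N"),
    ("a", "DET"), ("good", "ADJ"), ("idea", "N"),
    ("a", "DET"), ("long", "ADJ"), ("time", "N"),
]

def adj_noun_pairs(tagged, noun=None):
    """Count adjective + noun bigrams; if `noun` is given, keep only that noun."""
    counts = Counter()
    # Walk over adjacent token pairs and test the tag sequence ADJ, N.
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1 == "ADJ" and t2 == "N" and (noun is None or w2.lower() == noun):
            counts[(w1.lower(), w2.lower())] += 1
    return counts

adj_time = adj_noun_pairs(TAGGED, noun="time")  # [ADJ + time]
adj_any = adj_noun_pairs(TAGGED)                # [ADJ + N]
```

Leaving `noun` unset corresponds to the more abstract query: the lexical restriction is dropped and only the POS sequence is matched.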

  11. Limitations of these approaches • What we’ve done still retains a focus on the word. • The main purpose is to improve lexical research by incorporating a limited amount of grammatical info (usually POS) • Can we go further and really investigate grammar?

  12. Part 2 Collocational Frameworks

  13. Does this sound familiar? • Colourless green ideas sleep furiously • Chomsky’s example illustrates an approach to syntax where: • the primary focus is on syntactic rules • rules manipulate lexical items of the right categories • “grammatical” or “legal” is distinct from “sensible” or “meaningful” • syntactic rules operate (semi-) independently of lexical items: if X is of the right category, then X can be slotted into a syntactic position

  14. Chicken and egg questions • When we formulate an utterance, which comes first? • syntax? • lexical items? • both in parallel? • Do particular syntactic constructions have a meaning (or communicative function)? E.g. what is the meaning of: • the appositive that-construction: The reason that he gave was… • the extraposed it-construction: It is possible to hire a car if you want one.

  15. Lexical approaches to grammar • Assumptions: • syntactic structures are highly sensitive to the lexical items that they can select • structures also may have specific communicative functions or meanings • speakers/authors convey meaning, and syntax is used as a resource to convey it • ideally, grammar+lexis should be viewed as part and parcel of the same process • phraseology and co-selection play an important role • in particular constructions, we find that particular words tend to co-occur with great regularity

  16. The idiom principle • Sinclair (1991): • “a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices, even though they might appear to be analyzable into segments”

  17. Implications • The idiom principle suggests that speakers/writers: • Don’t just apply abstract rules to build structures; • Re-use bits of structure; • It also implies that bits of structure are themselves meaningful.

  18. The idiom principle vs open choice • This principle contrasts with the “open-choice” principle. • Open choice predicts that: • Syntactic rules operate independently of lexical items. • Structures are constructed by applying rules and “plugging” in lexemes.

  19. Putting the idiom principle to work • Sinclair and Renouf (1991) introduced collocational frameworks • Intended as a practical way to investigate the use and meaning of grammatical constructions • A collocational framework consists of a pattern involving 3 items: • A function word • A content word (specified via POS) • Another function word • Example: [a + Noun + of]

  20. Collocational frameworks • Is a pattern like [a + Noun + of] a linguistic unit? If it is, we would expect that: • The grammatical context (a, of) makes restrictions on the semantics of the Noun in the middle (not any noun can be used)

  21. Practical task 3 • Conduct a search for: • The collocational framework [a+Noun+of] • In looking at the nouns that occur here, can you spot any semantic commonalities? • What does this tell you about the way the structure itself is used, and what it usually means?

  22. [a + Noun + of] • Nouns in this construction are often quantities: • a lot of • a number of • ... • This suggests that this construction itself places a restriction on the semantics of the content words used in it.
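
A rough Python sketch of how the fillers of a collocational framework such as [a + Noun + of] might be counted. It assumes a POS-tagged corpus given as (word, tag) pairs; the toy data, the simplified tagset, and the function name `framework_fillers` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy POS-tagged text (hypothetical data, simplified tagset).
TAGGED = [
    ("a", "DET"), ("lot", "N"), ("of", "PREP"), ("people", "N"),
    ("a", "DET"), ("number", "N"), ("of", "PREP"), ("cases", "N"),
    ("a", "DET"), ("lot", "N"), ("of", "PREP"), ("time", "N"),
    ("a", "DET"), ("dog", "N"), ("in", "PREP"), ("the", "DET"), ("park", "N"),
]

def framework_fillers(tagged, left, pos, right):
    """Count words tagged `pos` that occur between the words `left` and `right`."""
    counts = Counter()
    # Slide a three-token window: fixed word, POS-constrained slot, fixed word.
    for (w1, _), (w2, t2), (w3, _) in zip(tagged, tagged[1:], tagged[2:]):
        if w1.lower() == left and t2 == pos and w3.lower() == right:
            counts[w2.lower()] += 1
    return counts

fillers = framework_fillers(TAGGED, "a", "N", "of")
```

Sorting `fillers` by frequency (e.g. with `fillers.most_common()`) gives the list of nouns to inspect for semantic commonalities; the counting itself says nothing about meaning, which still has to be judged by the analyst.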

  23. Collocational frameworks: final remarks • Sinclair and Renouf did not suggest that any string of words or pattern counts as a collocational framework. • Crucially, there has to be evidence for semantic restrictions on content words. • E.g. [Verb in NP] doesn’t count as a good pattern, because practically any verb can occur in the first position.

  24. Part 3 Colligates

  25. Colligations • Roughly, a collocation at the level of part of speech. • An idea due to Firth. The main question is: • What are the grammatical environments in which a particular word occurs? • One way of answering this question is to look for a word, and then look at the POSs to the left and right.

  26. Practical task 4 • Conduct a search for the word consequence, specifying any word to the right and any word to the left. • Make a frequency count of node tags. • What do you observe?

  27. Some data (Gries 2009) • Left context of consequence • Article • Adjective • ... • Right context: • Of • Preposition • ...
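
One possible way to operationalise this in Python: count the POS tags immediately to the left and right of the node word. The sketch assumes (word, tag) pairs as input; the toy data, the simplified tags, and the function name `colligates` are hypothetical, and the counts are not those reported by Gries.

```python
from collections import Counter

# Toy POS-tagged text (hypothetical data, simplified tagset).
TAGGED = [
    ("the", "ART"), ("consequence", "N"), ("of", "PREP"), ("this", "DET"),
    ("a", "ART"), ("direct", "ADJ"), ("consequence", "N"), ("of", "PREP"), ("war", "N"),
    ("one", "NUM"), ("consequence", "N"), ("is", "V"), ("clear", "ADJ"),
]

def colligates(tagged, node):
    """Return Counters of the POS tags immediately left and right of `node`."""
    left, right = Counter(), Counter()
    for i, (word, _) in enumerate(tagged):
        if word.lower() != node:
            continue
        if i > 0:                      # there is a token to the left
            left[tagged[i - 1][1]] += 1
        if i + 1 < len(tagged):        # there is a token to the right
            right[tagged[i + 1][1]] += 1
    return left, right

left_tags, right_tags = colligates(TAGGED, "consequence")
```

The window here is a single token on each side; a wider window would be a straightforward extension but is not shown.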

  28. Observations • This operationalisation of the concept of colligation is highly related to the collocational framework of Renouf/Sinclair. • It’s primarily intended to give an idea of the grammatical environment in which a word occurs.

  29. Limitations • Both collocational frameworks and colligations have some drawbacks: • They’re still highly word-based • They focus only on POS (not full syntax) • Their view of grammatical structure is purely linear.

  30. Part 4 Some case studies

  31. Example 1: It as object • Components: • non-referential use of it • object of a verb • followed by an NP or AdjP • Examples (from the BNC): • Many people who use drugs regularly find it difficult to exist in a drug-free world . • You can also find it hard to remember things • in court unless they agree to do so , making it difficult for detainees to challenge the validity

  32. Example 1 continued • Typical analysis: • this construction involves extraposition: People who use drugs find existing in a drug-free world difficult. → People who use drugs find it difficult to exist in a drug-free world • Some empirical observations on lexis (Francis 1993): • 98% of cases involve find and make • some other verbs like think, consider, see to • Possible “meaning”/function of the structure: • a stereotyped way of presenting a situation in terms of how it is evaluated • evaluation is placed after the verb
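
A lexical observation of this kind could be checked with a sketch like the following, which counts the verb lemmas found in the linear pattern [Verb + it + Adjective]. The (word, lemma, tag) triples are hand-built from the slide's examples; the lemmas, the simplified tags, and the function name `verbs_before_it_adj` are assumptions. Note that the pattern cannot tell whether it is non-referential, so real hits would still need manual checking.

```python
from collections import Counter

# Toy (word, lemma, tag) triples; lemmas and tags are hand-assigned assumptions.
TOKENS = [
    ("find", "find", "V"), ("it", "it", "PRON"), ("difficult", "difficult", "ADJ"),
    ("to", "to", "TO"), ("exist", "exist", "V"),
    ("find", "find", "V"), ("it", "it", "PRON"), ("hard", "hard", "ADJ"),
    ("to", "to", "TO"), ("remember", "remember", "V"),
    ("making", "make", "V"), ("it", "it", "PRON"), ("difficult", "difficult", "ADJ"),
    ("for", "for", "PREP"), ("detainees", "detainee", "N"),
    ("saw", "see", "V"), ("it", "it", "PRON"), ("yesterday", "yesterday", "ADV"),
]

def verbs_before_it_adj(tokens):
    """Count verb lemmas occurring in the sequence Verb + 'it' + Adjective."""
    counts = Counter()
    for (_, lemma, t1), (w2, _, _), (_, _, t3) in zip(tokens, tokens[1:], tokens[2:]):
        if t1 == "V" and w2.lower() == "it" and t3 == "ADJ":
            counts[lemma] += 1
    return counts

verb_counts = verbs_before_it_adj(TOKENS)
```

Counting lemmas rather than word forms groups find/finds/finding and make/making together, which is what a claim about the verbs find and make requires.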

  33. Example 2: appositive clauses • Apposition: • a relation between an NP and another phrase which refers to the same thing (Leech and Svartvik, 1975) • Examples: • your daughter, the lawyer, is here • In English, can also occur with that-clauses and to-clauses: • the news that your daughter was here • the plot to assassinate the president

  34. Example 2: appositive clauses • Distinguished from restrictive relative clauses: • the dog that I saw yesterday • restricts the reference of the head noun • Appositive clause: • the fact that I came • does not restrict the reference of the head noun • “amplifies” or “qualifies” the head noun

  35. Example 2: Appositives • Appositive that-clauses (BNC): • The fining of airlines plus the fact that the nationals of many refugee-producing countries • as firm as the Emperor Augustus about the principle that a ruler's actual appearance matters less • Traditional grammars (Leech and Svartvik 1975): • “head noun must be an abstract noun” • Question: • what are the lexical restrictions here? • do they have implications for the function of this syntactic structure?

  36. Levels of stereotypicality in syntax • Phraseological constraints: • the co-selection of particular lexical items within a particular syntactic structure • These seem to range on a continuum. • At one extreme: fixed, unchanging constructions (behave like multi-word lexical items) • At the other: complete freedom in lexical selection.

  37. Phraseology • Completely fixed idioms: • it never rains but it pours • Less fixed idioms: • put on a brave face • putting a brave face on … • put a good face on… • Some room for lexical manoeuvre • Semi-prepackaged phrases which allow for variation: • I haven’t the faintest/foggiest/remotest idea/notion • Highly nebulous lexico-syntactic dependencies: • be a case of X • a case of déjà vu • a case of take the money and run • …

  38. Syntactic “fixedness” • Given the cline from fixed to flexible, some linguists (e.g. Francis 1993) suggest that the distinction between “lexicon” and “syntax” is arbitrary. • This argument is based on phraseological constraints observable only in very large corpora. • This is not too far from recent positions in Generative Grammar: • Jackendoff’s (2002) parallel architecture; • Construction Grammar (e.g. Goldberg, 1995)

  39. The “item” and the “environment” • Francis proposes that the distinction between “lexical item” and “syntactic environment” only be used for convenience. • Proposed method: • look at a syntactic environment • discover lexical regularities • focus on a subset of the lexical items • discover further generalisations about the grammar of those items

  40. Case study: Extraposed it-clauses • One of the most frequent adjectives is possible: • it is possible to hire a car • it is possible that it will rain • Proposed interpretations: • that-clause is used for possibility • to-clause is used to express ability • This suggests that possible might have (at least) two different meanings.

  41. The grammar of possible • Further patterns involving possible: • article + superlative adjective + possible + noun: the best possible start • as … as possible • … • Main idea: specifying the grammatical environments in which the item can occur can help specify its range of meanings. • These examples seem to confirm the ability/possibility uses of possible