ENG 626 CORPUS APPROACHES TO LANGUAGE STUDIES interpreting concordance lines

ENG 626


interpreting concordance lines

Bambang Kaswanti Purwo


[Hunston, Ch. 3]

the most basic way of processing corpus information

▪ find and interpret concordance lines

» search for

۰a single word-form (e.g. point)

۰ a lemma (e.g. CONDEMN)

۰ a series of words (e.g. on ADJECTIVE terms with)

۰ a concept, via a word or phrase that often co-occurs with it

(e.g. what would, which co-occurs with expressions of hypotheticality)

» sort the lines so that the lines that are like each other

in some way appear next to each other [Hunston p. 40]
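The search-and-sort step can be sketched in a few lines of Python. This is a minimal illustration, not the software Hunston describes: the function names `concordance` and `sort_right` and the sample sentences are invented for the example, and tokenisation is a crude regular expression.

```python
import re

def concordance(text, node, width=3):
    """Return (left, node, right) tuples for each occurrence of `node`.

    `width` is the number of context words kept on each side.
    """
    tokens = re.findall(r"[A-Za-z'-]+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines

def sort_right(lines):
    """Sort lines by the words to the right of the node, nearest word first."""
    return sorted(lines, key=lambda line: [w.lower() for w in line[2]])

# Invented sample text, not taken from any corpus.
sample_text = (
    "She made a point of leaving early. There is no point in waiting. "
    "At this point the road turns. He missed the point of the story."
)

for left, node, right in sort_right(concordance(sample_text, "point")):
    print(f"{' '.join(left):>20}  {node}  {' '.join(right)}")
```

Sorting on the right-hand context brings the two point of lines next to each other, which is the effect the sort is meant to have.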

▪ search for a word (left and right)

critical ۰ often follows a form of the verb BE (be or is)

۰ sometimes follows a determiner (a, his, this)

۰ sometimes used in compounds (self-critical)

۰ sometimes follows a grading adverb (highly, more)




critical is ۰ sometimes followed by of, to, and in

۰ a different meaning is associated with each preposition

be critical of ‘negative opinion’

be critical to, be critical in ‘important’

۰ sometimes followed by a noun

(critical clue, critical importance, critical juncture)

۰ syntactically can be used attributively or predicatively

۰ when used attributively, critical is likely to mean ‘important’

۰ of and to are the most frequent prepositions to follow critical
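Observations like these come from counting what stands immediately to the left and right of the node word. A minimal sketch, assuming a plain-text corpus and a crude tokeniser; the function name `neighbours` and the four sample sentences are invented for illustration:

```python
from collections import Counter
import re

def neighbours(text, node):
    """Count the words immediately left and right of each `node` occurrence.

    Sentence boundaries are ignored, which a real tool would handle.
    """
    tokens = re.findall(r"[a-z'-]+", text.lower())
    left, right = Counter(), Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            if i > 0:
                left[tokens[i - 1]] += 1
            if i + 1 < len(tokens):
                right[tokens[i + 1]] += 1
    return left, right

# Invented sentences, not corpus data.
critical_text = (
    "He is critical of the plan. Timing is critical to success. "
    "This is a critical juncture. She was highly critical of him."
)

left_counts, right_counts = neighbours(critical_text, "critical")
print(left_counts.most_common())
print(right_counts.most_common())
```

Because the tokeniser keeps hyphens, a compound such as self-critical counts as a separate word-form rather than as an instance of critical.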

▪ search for a phrase or specific word-classes

on ADJ terms with [Hunston p. 41]

the ADJ can be grouped according to meaning:

۰familiar, friendly, intimate ‘a degree of closeness’

۰good, reasonable, bad ‘whether or not the two groups like

each other’

۰equal ‘a similarity in status’
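A phrase search with an open slot can be approximated with a regular expression. This sketch assumes an untagged corpus, so the slot accepts any single word rather than only adjectives; `PATTERN`, `slot_fillers` and the sample sentences are invented for illustration.

```python
import re
from collections import Counter

# The middle slot is any one word: without part-of-speech tags
# the search cannot check that it is really an adjective.
PATTERN = re.compile(r"\bon\s+(\w+)\s+terms\s+with\b", re.IGNORECASE)

def slot_fillers(text):
    """Count the words that fill the slot in 'on ___ terms with'."""
    return Counter(word.lower() for word in PATTERN.findall(text))

# Invented sentences, not corpus data.
terms_text = (
    "They were on friendly terms with the neighbours. "
    "She is on good terms with her boss. "
    "We met on equal terms with them. "
    "He stayed on good terms with everyone."
)

print(slot_fillers(terms_text).most_common())
```

Grouping the fillers by meaning, as in the list above, remains a manual step for the analyst.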


What is observable from concordance lines?

▪ observing the ‘central and typical’

▪ observing meaning distinctions

▪ observing meaning and pattern

▪ observing detail

types of observation

central vs. typical: the distinction between

▪ what can and what cannot be used in a particular language

▪ what occurs frequently and what is possible but rarely occurs in practice

corpora cannot ▪ offer “negative evidence”

(what is impossible in a language)

▪ determine what is possible

no demarcation between “correct” and “incorrect”

e.g. I’m just sort of showing you perhaps some dishes which

are more healthier than others

a corpus offers information that a native speaker (NS) cannot replicate:

an indication of ‘central and typical’ usage


TYPICAL to describe the most frequent meanings or

collocates or phraseology of an individual word or phrase

see ten randomly selected concordance lines for recipe for (p. 43)

▪ the typical meaning of recipe for: metaphoric, not literal

(only line 10 is an exception to this)

▪ the nouns following for are slightly more frequently negative

(damage 1, failure 4, slump 5, chaos 6, disaster 8) than

they are positive (surprise 2, success 3 and 9)

or neutral (government 7)

▪ when metaphoric, most frequently follows BE and a (lines 1, 3, 4, 6, 8)

most exceptions to this (lines 2, 7, 9) are positive or neutral

[although recipe for has a range of meanings, collocates, and

grammatical co-texts] its typical use is in the sequence

‘something is a recipe for something bad’

a typical example would be line 1: it does not show all the ways that the

phrase can be used, but it combines all the most frequent features
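The tallies behind this description can be checked mechanically. The sketch below only re-encodes the line numbers and evaluations given above for the ten recipe for lines; the variable names are invented, and the labels are the analyst's judgments, not something the code derives.

```python
from collections import Counter

# Evaluation of the noun after "recipe for", by concordance line number,
# as listed above. Line 10 (the literal use) is left out.
EVALUATION = {
    1: "negative", 2: "positive", 3: "positive", 4: "negative",
    5: "negative", 6: "negative", 7: "neutral", 8: "negative",
    9: "positive",
}
# Metaphoric lines in which "recipe for" follows BE + a.
BE_A_LINES = {1, 3, 4, 6, 8}

counts = Counter(EVALUATION.values())
exceptions = sorted(set(EVALUATION) - BE_A_LINES)
non_negative_exceptions = [n for n in exceptions if EVALUATION[n] != "negative"]
# Lines combining both frequent features: BE + a and a negative noun.
typical = sorted(n for n in BE_A_LINES if EVALUATION[n] == "negative")

print(counts)
print(exceptions, non_negative_exceptions, typical)
```

Line 1 is among the lines that combine both features, which is why it can serve as the typical example.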


speakers of a language may have intuitions about typicality,

but these do not always accord with evidence of frequency

cf. “prototypical” (Barlow 1996, Shortall 1999): usage commonly

felt to be typical but not necessarily most frequent

English teaching course books tend to present usage which is

prototypical but not typical in the sense of “most frequently


e.g. on “comparatives”

prototypical: The USSR is larger than China (Hsia et al. 1989:178)

[a sample of 100 lines of larger from the Bank of English]

۰only 17 included than

۰ in most lines larger is followed by a noun: a much larger

plan, their larger but poorer northern neighbours

[comparison is implicit]


[reflexive pronouns such as herself]

coursebook writers present these pronouns contrastively

be proud of oneself vs. be proud of one’s child

students were asked to produce:

I saw myself in the mirror.

He hit himself with the hammer.

We dried ourselves with the towel.

Barlow (1996) notes

reflexives have phraseologies

quite distinct from those

associated with other pronouns

۰the most frequently used verb is FIND

found myself by the sea has a very different meaning from

found him by the sea

۰ the other verbs to co-occur with reflexives most frequently

are those indicating thoughts and speech: SEE, IMAGINE,

VISUALISE, CONSIDER, ASK (Barlow 1996:9), rather than

the verbs of physical action (he hit himself, etc.)


observing meaning distinction

many words have meanings that are similar,

yet not substitutable one for the other

[of little help]

dictionaries deal with

the words separately,

rather than comparatively

→ observing typical usages of

near-synonyms can clarify

differences in meaning

Partington’s (1998:33-46) study: “semi-grammatical” words

words which by themselves carry only a general meaning

intensifying ADJs: sheer, pure, complete, utter, absolute

(dictionaries tend to define these words in similar ways)

▪ sheer [+ nouns of degree or magnitude]: sheer weight, sheer number

▪ in the pattern the sheer N of N: the sheer scale of the shelling

▪ the other ADJs do not collocate with these nouns


observing meaning and pattern

the meaning of a word is closely associated with its co-text

although ambiguity is possible, for the most part the meanings

of words are distinguished by the patterns or phraseologies in

which they typically occur

initiative [n]: three distinct meanings [Hunston p. 46]

1. [a count noun] ‘something that someone (usually a

government agency or other institution) starts to try

to solve a problem’

2. the initiative is used with verbs meaning ‘take’ or ‘lose’

take the initiative ‘start sth and so gain an advantage

over a competitor’; lose the initiative ‘fail to start sth

and so allow a competitor to gain an advantage’

3. ‘the quality of being able to do things without being told’

only the possessive (e.g. their, his) as DET; mostly no DET

→ a matter of distinction between patterns and usage

(not meaning and phraseology)


CONDEMN [v]: several different meanings [Hunston p. 47]

each meaning is associated with a particular pattern

1. ‘criticise’: condemn something, condemn sth as sth

2. ‘pass sentence’: condemn sth to sth

3. ‘sentenced to death’

4. ‘make something bad happen’: condemn sth to sth


observing detail

[so far] concordances have been used to give very general ideas

about ۰the ways that words behave

and ۰the meanings that can be associated with patterns

[any work with concordances]

tends to lead to more specific observations about the

behavior of individual words

ANSWER is often followed by as to

and a clause beginning with a wh- word

advice as to often follows a verb indicating

‘getting’, ‘giving’, ‘wanting’ or ‘offering’ (see Hunston p. 51)

ANSWER as to tends to follow the same kind of verb

often follows a phrase indicating that a clear answer is not available,

that a clear answer is difficult,

or that a clear answer is unexpected


coping with a lot of data: using phraseology

[one of the problems with the increasing size of corpora]

searches for frequent words yield too much data

to be interpretable in the form of concordance lines

[a corpus user can cope with looking at] about

▪ 100 lines for general patterns

▪ 30 lines for detailed patterns

Sinclair (1999)

selecting 30 random lines and noting the patterns in them

then selecting a different 30, noting the new patterns

then another 30 and so on

→ until the selections no longer yield anything new

→ “hypothesis testing”: a small selection of lines is used

as a basis for a set of hypotheses about patterns

other searches are used to test those hypotheses

and form new ones
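The repeated-sampling procedure can be sketched as a loop that stops when a batch adds no new pattern. This is an assumption-laden illustration: `sample_until_stable` is an invented name, and `pattern_of` stands in for the analyst's own judgment about which pattern a line shows. In the toy data each line is reduced to the word to the right of the node.

```python
import random

def sample_until_stable(lines, pattern_of, batch=30, seed=0):
    """Draw successive random batches until a batch adds no new pattern.

    Returns the set of patterns seen and the number of batches examined.
    Batches never overlap, so each one is a different selection of lines.
    """
    rng = random.Random(seed)
    pool = list(lines)
    rng.shuffle(pool)
    seen = set()
    batches = 0
    for start in range(0, len(pool), batch):
        new = {pattern_of(line) for line in pool[start:start + batch]} - seen
        batches += 1
        if not new:
            break  # this batch yielded nothing new
        seen |= new
    return seen, batches

# Toy data: 100 lines, each represented only by its right-hand neighbour.
toy_lines = ["of"] * 50 + ["that"] * 40 + ["for"] * 10
patterns, batches_used = sample_until_stable(toy_lines, pattern_of=lambda line: line)
print(sorted(patterns), batches_used)
```

The stopping rule can miss a rare pattern that happens not to appear before the first unproductive batch, which is one reason further searches are then used to test the hypotheses.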


SUGGESTION and point

suggestion [n]

20 random concordance lines for SUGGESTION

sorted one to the right of the node-word [Hunston p. 52]

▪ the lines show SUGGESTION frequently followed by

a finite clause (with that or not)

as, to, for, and of

50 more lines are selected

delete the lines showing “SUGGESTION + a finite clause”

and “SUGGESTION as an ordinary noun”

(my suggestions never got past his desk)

▪ the remaining lines confirm SUGGESTION frequently + of

▪ two lines SUGGESTION + for

▪ no lines SUGGESTION + as to

▪ a new pattern emerges: “+ inf. clause” (a suggestion to pipe seawater)


point: extremely frequent word in English

Bank of English – 100,000 instances [Hunston p. 55]

20 random concordance lines for point

the phraseology of point is highlighted in bold type

▪ what comes before point?

a point, the point, no point, and so on

▪ what comes after point?

point of, point in, and so on

▪ based on a word-class: possessive followed by point

present participle followed by point

[see Table 3.1]

▪ point is found to indicate the name of a place (line 4)

a way of scoring in a game (line 20)

▪ point is used with this or that [anaphoric] (line 9)


using probes

[so far]

a search for a word or a phrase to gain more information

about that word or phrase

[it is possible]

to use searches to find sets of words or expressions

that cannot easily otherwise be called to mind

→ these searches are called “probes”

e.g. how are men and women typically evaluated?

the sequence something/nothing + ADJ + about/in + him/her

to find lists of ADJs used to describe a male or a female person

male: absurd, arresting, attractive, big, candid, dangerous,

decent, disturbing, fantastic, funny, heroic, impatient, etc

female: appealing, bad, dark, decadent, exotic, extraordinary,

obsessive, professional, sacred, special, vulnerable, etc.

Hunston pp. 62-63
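A probe of this kind can be sketched as a regular expression over untagged text. The names `PROBE` and `run_probe` and the sample sentences are invented for illustration. Without part-of-speech tags the slot accepts any single word, and her is also matched when it is possessive (something odd about her voice), so the output would need manual editing.

```python
import re
from collections import defaultdict

# something/nothing + one word + about/in + him/her
PROBE = re.compile(
    r"\b(?:something|nothing)\s+(\w+)\s+(?:about|in)\s+(him|her)\b",
    re.IGNORECASE,
)

def run_probe(text):
    """Group the words found in the slot by the pronoun that follows."""
    found = defaultdict(list)
    for adjective, pronoun in PROBE.findall(text):
        found[pronoun.lower()].append(adjective.lower())
    return dict(found)

# Invented sentences that reuse adjectives from the lists above.
probe_text = (
    "There was something heroic about him. There is nothing funny about him. "
    "She saw something vulnerable in her. There was something exotic about her."
)

print(run_probe(probe_text))
```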


issues in assessing and interpreting concordance lines

▪ variation in the kind of search that is possible: using the

word, lemma, or phrase as a target

▪ [with some searches] the need to edit the lines to separate

the target phrase from others that the search has found

▪ the need to sort lines to make the patterning in them

more visible

▪ [often] necessary to look at only part of each line in a set of

concordance lines in order to identify patterning

▪ [conversely] the need to look at more co-text

▪ the need to tackle a large amount of data by looking at

successive groups of a small number of lines, forming and

testing hypotheses

▪ the need to concentrate on evidence for “central and typical”

▪ the need to consider counter-examples