Sentence Processing: Multiple constraints in action
Redundancy simplifies computation
• We've seen how redundant cues of many kinds affect word recognition
• Today we will look at how such cues help us decipher the information in sentences so well
• Why do we need redundant sources of information?
• In information processing, redundancy removes ambiguity, as with the error-checking bits in computer communication
• Uncertainty is decreased (the probability of a correct interpretation is increased) whenever something you know narrows down the range of what you don't know
• Any regularity is by definition informational: it increases the probability of correctly predicting what you don't know (this is what it means to have a regularity)
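The two ideas on this slide (error-checking bits, and uncertainty shrinking as the range of possibilities narrows) can be illustrated with a minimal Python sketch. The function names and the choice of 8 versus 2 equally likely candidates are assumptions made for illustration only; they do not come from the lecture.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def uniform(n):
    """n equally likely alternatives."""
    return [1.0 / n] * n

def add_parity(bits):
    """Append one redundant bit so the frame holds an even number of 1s."""
    return bits + [sum(bits) % 2]

def parity_ok(frame):
    """True if the frame still has an even number of 1s."""
    return sum(frame) % 2 == 0

# Assumed scenario: a cue narrows 8 equally likely candidates down to 2.
uncertainty_before = entropy_bits(uniform(8))  # 3 bits
uncertainty_after = entropy_bits(uniform(2))   # 1 bit

# A redundant parity bit lets the receiver detect a single flipped bit.
frame = add_parity([1, 0, 1, 1])
corrupted = frame[:]
corrupted[2] ^= 1  # simulate noise on the channel
```

Here `parity_ok(frame)` is true and `parity_ok(corrupted)` is false: the extra bit carries no new message content, yet it is exactly what lets the error be noticed.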
Pattern and information
“Any aggregate of events or objects (e.g., a sequence of phonemes, a painting, or a frog, or a culture) shall be said to contain ‘redundancy’ or ‘pattern’ if the aggregate can be divided in any way by a ‘slash mark’, such that an observer perceiving only what is on one side of the slash mark can guess, with better than random success, what is on the other side of the slash mark. We may say that what is on one side of the slash contains information or has meaning about what is on the other side. Or, in engineer’s language, the aggregate contains ‘redundancy’. Or, again, from the point of view of a cybernetic observer, the information available on one side of the slash will restrain (i.e., reduce the probability of) wrong guessing.”
Gregory Bateson, Steps to an Ecology of Mind, p. 104
It’s good to be on top!
• Sentences benefit from being at the high end of the linguistic hierarchy
• Constraints from many levels help disambiguate sentences: syntactic, semantic, prosodic, pragmatic & ‘probabilistic’
• They are all ‘probabilistic’, but the last category emphasizes that even if you know nothing at all about syntax, semantics, and pragmatics, language is still not random: some words appear more often than others, and some words are more likely than others to follow certain words
• Recall Tom Landauer’s claim: 45% of sentences can be reconstructed from their words by maximizing co-occurrence probabilities
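To make the idea of reconstructing a sentence from its words concrete, here is a minimal Python sketch. It is not Landauer's procedure, which the slides do not describe. As an assumption, it scores every ordering of a scrambled bag of words with an add-one-smoothed bigram model and keeps the most probable one. The three-sentence corpus and all names are hypothetical and invented for illustration.

```python
import math
from collections import Counter
from itertools import permutations

# Hypothetical toy corpus, invented for this sketch.
TOY_CORPUS = [
    "the train pulled into the station",
    "the old man pulled the boats",
    "the train left the station",
]

def train_bigrams(sentences):
    """Count adjacent word pairs, with <s> and </s> marking sentence edges."""
    pair_counts = Counter()
    context_counts = Counter()
    vocabulary = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocabulary.update(tokens)
        for first, second in zip(tokens, tokens[1:]):
            pair_counts[(first, second)] += 1
            context_counts[first] += 1
    return pair_counts, context_counts, len(vocabulary)

def log_probability(words, model):
    """Log probability of one word order under the bigram model."""
    pair_counts, context_counts, vocab_size = model
    tokens = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for first, second in zip(tokens, tokens[1:]):
        # Add-one smoothing: unseen pairs stay possible but unlikely.
        numerator = pair_counts[(first, second)] + 1
        denominator = context_counts[first] + vocab_size
        total += math.log(numerator / denominator)
    return total

def best_order(bag_of_words, model):
    """Try every ordering (fine for short sentences) and keep the likeliest."""
    return max(permutations(bag_of_words),
               key=lambda order: log_probability(order, model))

MODEL = train_bigrams(TOY_CORPUS)
SCRAMBLED = ["station", "the", "pulled", "train", "into", "the"]
RECONSTRUCTED = " ".join(best_order(SCRAMBLED, MODEL))
```

With this toy corpus, `RECONSTRUCTED` is "the train pulled into the station". Exhaustive search over permutations grows factorially with sentence length, so a realistic version would need a smarter search and a far larger corpus.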
Order in probability
• If language were entirely random, then all words would be equally likely, and we already know that:
a.) Some words are much more frequent than others
b.) The language system is sensitive to word probabilities: high-frequency (HF) words are recognized more quickly than low-frequency (LF) words
c.) Some words can be followed by only a very narrow range of words, e.g. articles are almost always followed by adjectives or nouns, and never by verbs, articles, or prepositions
Order in probability
• If language were entirely random, then all words would be equally likely, and we already know that:
d.) Syntactic constraints specify that words have an unequal probability of appearing in certain places (e.g. subject usually before verb before object, or heads in leftmost position in VP and NP)
- But this is much enhanced by semantics: compare ‘The N Ved into the X’ with ‘The train pulled into the X’
e.) Syntactic constraints specify that clauses have an unequal probability of appearing in certain places (e.g. 'if', 'in order to', 'because', 'since', 'whenever', etc. all require closing clauses)
The psychological reality of syntax
• Experimental evidence shows that words within a single clause are read more quickly than between-clause words
• Readers are especially slowed when they reach an ambiguous section (garden path sentences like ‘The old man the boats’: we expect to discover what the old man did)
• If you stop sentences midway and ask people to recall the sentence they just heard, they tend to report it by clause boundaries
The psychological reality of syntax
• Other evidence also shows that clauses have psychological reality
• If you play a click in the middle of the auditory presentation of a sentence, subjects are much better at saying afterwards where it was if it came between clauses than if it occurred within a clause
• Moreover, they tend to report it as occurring at a clausal boundary far more often than it actually occurred there
• This effect remains even if you remove the disambiguating information, such as pauses and changes in intonation, that usually accompanies clausal boundaries: in a robotic monotone voice, clicks at clause boundaries were still better recalled than clicks within clauses
• You get it even with (fake) subliminal clicks
Memory or language?
• One question about these studies is whether it was a memory effect or a language perception effect
• If you ask people to indicate the click's position by pointing to a written sentence rather than by speaking, the effect is reduced, suggesting a memory effect
• That is, the effect may be due to chunking
• But it is nevertheless still present, suggesting a language perception effect
Memory or language?
• Moreover…we can always m’u the distinction between language and memory
• Following Chomsky, there may be a deep relation between how syntax chunks words into roles and clauses, and how we think
• Readers show pauses between clauses, and an effect of how many chunks there are to integrate (more = longer pauses)
• Following Fauconnier, there may be a deep relation between how we carve up and chunk experience, and how that chunking is mapped onto our language system, so that memory chunking is linguistic (or reflects the same psychological constraints)
• Even single words show amazing memory properties (Terry Deacon), and so do narratives/stories/myths
• Language is in part a technology for memory enhancement
Order in probability
• If language were entirely random, then all words would be equally likely, and we already know that:
f.) Pragmatic rules limit what can come next to things related to what has already been communicated or to what is currently happening
Pragmatics: Grice’s Maxims
• There are a great many complex pragmatic rules, but the best known are the Gricean Maxims:
i.) The Maxim of Quality: Speakers should tell the truth as they know it, or explicitly acknowledge their uncertainty about the truth if they are aware of it
ii.) The Maxim of Manner: Speakers should strive to be clear, succinct, and unambiguous
iii.) The Maxim of Quantity: Speakers should say all that is necessary or required, but no more than that
iv.) The Maxim of Relation: Speakers should say only what is relevant
Order in probability
• If language were entirely random, then all words would be equally likely, and we already know that:
g.) Semantics contributes to resolving uncertainty
• The sentence 'The witness examined by the lawyer was useless' is read more slowly than the sentence 'The evidence examined by the lawyer was useless', even though both sentences have identical structure and phoneme count
• The reason is that witnesses are active beings who can examine, but evidence isn’t active and so cannot examine: it can only be examined
• The phrase 'The witness examined' is ambiguous at that point in a way that 'The evidence examined' is not
• We say the second sentence is semantically constrained in a way that the first is not
Order in probability
• If language were entirely random, then all words would be equally likely, and we already know that:
h.) We can (and we do) use prosody, stress, and pauses to disambiguate sentences such as these when they are spoken
• e.g. speakers tend to automatically lengthen the final vowel in a word just before a clause boundary and also insert a pause
Context matters
• Each one of these things takes away a little bit of uncertainty in interpreting sentences
• This can be shown experimentally: increased context makes words more likely to be recognized under conditions of decreased exposure to that word or increased noisiness, and makes those words more likely to be recalled
• Grosjean: words in context are recognized within 175-200 ms of their onset (half their length); words out of context need over 100 ms more (average = 300 ms)
• Cohort model applies to sentence processing too
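The link between context and earlier recognition can be sketched in the spirit of the cohort model. This is a toy illustration under stated assumptions, not the model itself: letters stand in for phonemes, and the word lists, the context-filtered candidate set, and the function name are all hypothetical.

```python
def recognition_point(target, candidates):
    """Number of initial letters heard before `target` is the only
    candidate left in the cohort; None if it never becomes unique."""
    for length in range(1, len(target) + 1):
        prefix = target[:length]
        cohort = [word for word in candidates if word.startswith(prefix)]
        if cohort == [target]:
            return length
    return None

# Hypothetical lexicon: with no context, every word is a candidate.
LEXICON = ["station", "statue", "stadium", "stable", "stamp",
           "siding", "tunnel", "train"]

# Assumed effect of a context such as 'The train pulled into the ...':
# only words that plausibly complete the sentence stay in the cohort.
CONTEXT_CANDIDATES = ["station", "siding", "tunnel"]

without_context = recognition_point("station", LEXICON)          # 5 letters
with_context = recognition_point("station", CONTEXT_CANDIDATES)  # 2 letters
```

In this toy setup, "station" becomes unique after five letters out of context but after only two in context. That is the same direction as the Grosjean timing difference, although the specific numbers here are artifacts of the invented word lists.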