Information Density and Word Order

Why are some word orders more common than others?
  • In the majority of languages with a dominant word order, subjects precede objects
  • (SOV, SVO) > VSO > (VOS, OVS) > OSV
Why are some word orders more common than others?
  • Genetically encoded bias?
  • Single common ancestor (SOV)?
  • General linguistic principles
    • Theme-first
    • Verb-object bonding
    • Animate-first
  • Great, but why do these principles work?
Uniform information density hypothesis
  • Speakers aim for a constant information transmission rate
    • Slower for unexpected, high entropy content
    • Faster for predictable, low entropy content
  • The basic word order of a language influences the average transmission rate
  • Thus languages that are closer to the UID ideal will be more common compared to others further away from it
Word-order model
  • Simple world with
    • 13 objects (O)
      • 5 people
      • 8 food/drink items
    • 2 relations (R)
      • eat/drink
  • Events in this world consist of one relation and two objects
    • (o1, r, o2)
  • And appear with a certain probability P
Word-order model
  • Base entropy H0 (the observer's uncertainty before any word is spoken)
  • After each word, observers adjust their expectations for the following ones, reaching an entropy of zero after the third word of the event
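The entropy trajectory described above can be sketched in a few lines. This is a minimal illustration, not the model from the slides: the four-event mini-world, its probabilities, and the function names are all assumptions made for the example.

```python
import math

# Hypothetical mini-world (not the 13-object world from the slides):
# each event is (agent, relation, patient) with an assumed probability.
EVENTS = {
    ("alice", "eat", "apple"): 0.4,
    ("alice", "drink", "water"): 0.3,
    ("bob", "eat", "apple"): 0.2,
    ("bob", "drink", "tea"): 0.1,
}

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

def entropy_trajectory(event, order, events=EVENTS):
    """Return [H0, H1, H2, H3] for `event` spoken in word order `order`.

    `order` is a permutation of "SVO" (S = agent, V = relation, O = patient).
    `event` must be one of the keys of `events`.
    """
    slot = {"S": 0, "V": 1, "O": 2}
    remaining = dict(events)
    trajectory = [entropy(remaining)]  # H0: uncertainty before any word
    for role in order:
        i = slot[role]
        # Keep only events consistent with the word just heard, then renormalise.
        remaining = {e: p for e, p in remaining.items() if e[i] == event[i]}
        total = sum(remaining.values())
        remaining = {e: p / total for e, p in remaining.items()}
        trajectory.append(entropy(remaining))
    return trajectory

for order in ("SOV", "SVO", "VSO"):
    print(order, [round(h, 3) for h in entropy_trajectory(("alice", "eat", "apple"), order)])
```

In a world this small the entropy can reach zero before the third word; the point of the sketch is only that different word orders trace different paths from H0 down to zero for the same event.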
Word-order model
  • Each event has an information profile

I1 = H0 − H1, I2 = H1 − H2, I3 = H2 − H3 = H2

  • Where Hn is the entropy remaining after the n-th word (H0 is the base entropy and H3 = 0)
  • UID suggests a straight line from base entropy to zero entropy such that each word conveys 1/3 of the total information
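As a small worked example, taking each word's information to be the drop in entropy it causes, a profile can be computed from an entropy trajectory and compared with the UID ideal. The trajectory values and function names below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def information_profile(trajectory):
    """Information carried by each word: the entropy drop it causes."""
    return [before - after for before, after in zip(trajectory, trajectory[1:])]

def uid_ideal_profile(base_entropy, n_words=3):
    """Under UID every word carries an equal share of the base entropy."""
    return [base_entropy / n_words] * n_words

# Hypothetical entropy trajectory [H0, H1, H2, H3] in bits (illustrative values).
trajectory = [3.0, 1.5, 0.5, 0.0]
profile = information_profile(trajectory)  # [1.5, 1.0, 0.5]
ideal = uid_ideal_profile(trajectory[0])   # [1.0, 1.0, 1.0]
print(profile, ideal)
```

Here the first word carries half of the total information rather than a third, so this hypothetical event is front-loaded relative to the straight line that UID suggests.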
Word-order model
  • UID deviation score
  • Deviation of toy-world events from the “ideal information profile” according to UID
  • Resulting ranking, closest to the UID ideal first: VSO > VOS > SVO > OVS > SOV > OSV
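The slides do not give the formula for the deviation score, so the sketch below assumes one plausible definition: the summed absolute gap between each word's share of the total information and the uniform share, rescaled so that 0 means perfectly uniform and 1 means one word carries everything. The function name and the normalisation are assumptions, not the authors' definition.

```python
def uid_deviation(profile):
    """Assumed deviation score in [0, 1] for a non-negative information profile.

    0: every word carries the same share of the total information.
    1: a single word carries all of it.
    """
    total = sum(profile)
    n = len(profile)
    # Summed absolute gap between each word's share and the uniform share 1/n.
    gap = sum(abs(info / total - 1 / n) for info in profile)
    # The largest possible gap (all information in one word) is 2 * (n - 1) / n.
    return gap / (2 * (n - 1) / n)

print(uid_deviation([1.0, 1.0, 1.0]))  # uniform profile
print(uid_deviation([1.5, 1.0, 0.5]))  # front-loaded profile
print(uid_deviation([3.0, 0.0, 0.0]))  # first word carries everything
```

Averaging such a score over all events, weighted by event probability, would give one number per word order, which is the kind of quantity a ranking of word orders could be built from.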
Corpus study
  • Child-directed speech (English and Japanese corpora)
  • Utterances involving singly transitive verbs
  • Ignored adjectives, plurality, tense, etc.
  • English: VSO (0.38), SVO (0.41), VOS (0.48), SOV (0.64), OSV (0.78), OVS (0.79)
  • Japanese: SVO (0.66), VSO (0.71), SOV (0.72), VOS (0.72), OSV (0.82), OVS (0.83)
  • Languages must be optimal with respect to the frequencies of events in the real world
  • Judgement tasks for pairs of sentences (which one is more probable?)
  • VSO (0.17), SVO (0.18), VOS (0.20), SOV (0.23), OVS (0.23), OSV (0.24).
  • Object-first word orders are rare
  • Object-first word orders have the least uniform information density (the first word carries too much information)
  • SOV is not as compatible with the UID as it is frequent in real languages – perhaps due to other important factors beside UID
  • The theme-first principle (TFP) and animate-first principle (AFP) favor SOV, SVO (highest ranked in the results) and VSO – perhaps UID provides some justification for at least some word order rankings
  • Findings are consistent with a weaker hypothesis that word order is optimal with respect to how often speakers choose to discuss events (not how often these events really occur)
  • UID may not provide explanation for all of the word order rankings, but does explain several aspects of the empirical distribution of word orders
A Noisy Channel Account of Crosslinguistic Word Order Variation
  • In 96.3% of studied languages S precedes O
  • SVO (English) and SOV (Japanese) are more prevalent than VSO
  • People construct sentences from an agent's perspective – why SVO/SOV then?
  • Innate universal grammar – independent of communicative or performance factors
Why SOV/SVO?
  • Communicative-based explanation
  • SOV default for the human language
    • Preference for S to precede O
    • Preference for V to appear at the end of the clause
  • SVO arises from SOV as a result of communication/memory pressures that sometimes outweigh the second preference
Shannon’s communication theory
  • Comprehension and production operate via a noisy channel
  • Speakers are under pressure to choose utterances that ensure maximal meaning recoverability by the listener
  • When does word order affect how easily meaning can be recovered?
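    • The girl kicks the ball. (non-reversible: only the girl can be the agent, so people should adhere to SOV)
    • The girl kicks the boy. (reversible: either noun could be the agent)

(potential confusion is perhaps resolved by the position of each noun relative to the verb)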
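To make the recoverability argument concrete, here is a deliberately simplified sketch. It assumes a noise model in which exactly one noun is lost and the listener sees only whether the surviving noun comes before or after the verb. That noise model and the function names are assumptions of the sketch, not details given in the slides.

```python
def surviving_signal(order, lost_role):
    """What the listener perceives after noise deletes one noun.

    `order` is a string such as "SOV"; `lost_role` is "S" or "O".
    Nouns appear only as "N", because the listener hears no role labels.
    """
    return "".join("V" if role == "V" else "N" for role in order if role != lost_role)

def role_recoverable(order):
    """True if losing the subject and losing the object leave different
    signals, so position alone reveals the surviving noun's role."""
    return surviving_signal(order, "S") != surviving_signal(order, "O")

def meaning_recoverable(order, reversible):
    """For non-reversible events plausibility identifies the agent
    (a ball cannot kick); for reversible events only position can."""
    return (not reversible) or role_recoverable(order)

for order in ("SOV", "SVO"):
    print(order, meaning_recoverable(order, reversible=False),
          meaning_recoverable(order, reversible=True))
```

Under these assumptions SOV leaves "noun, verb" whichever noun is lost, so a reversible event becomes ambiguous, while SVO leaves the surviving noun on a different side of the verb depending on its role.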

  • The study investigates whether gestured word order across languages (English: SVO; Japanese and Korean: SOV) depends on the semantic reversibility of the event
    • Initial bias to SOV
    • Initial bias to native language
    • Communicative or memory pressures
  • English
    • Shift to SVO (second and third factors)
  • Japanese & Korean
    • Shift to SVO (only due to the third factor)
  • Brief silent animations of intransitive/transitive events
    • First verbally described the animations
    • Then hand-gestured the meanings of the events
  • Verbal and gesture responses were coded for the relative position of the agent, action, and patient
Experiment 1
  • Animate/inanimate patients (reversible or non-reversible sentences)
  • More SVO word orders should be produced if reversible
  • Results – uniformly SVO for verbal responses
    • Gestured S before O for animate patients
    • Gestured V before O for human patients (as expected)
    • Overwhelmingly gestured SOV for non-reversible events
Experiment 1&2 – Japanese/Korean
  • English participants’ results can be explained without resorting to noisy-channel hypothesis
    • Participants may shift from SOV to native (SVO) due to increased ambiguity in reversible events
  • Thus, participants with an SOV native language were tested
    • Expected shift to SVO in reversible events
  • Experiment 2 – used more complex structures

The old woman says that the fireman kicks the girl

Experiment 1&2 – Japanese/Korean
  • If participants use native word-order (SOV)
    • Then they should gesture both levels of embedded events with the same order:

S1[S2O2V2] V1

  • If, for reversible events, SOV creates maximal potential confusion
    • Then they should gesture using SVO:

S1 V1 [S2V2O2]

Experiment 1&2 – Japanese/Korean
  • Exp 1 results – native language word-order
    • J&K speakers verbalized patient before action (100%)
    • Gestured patient before action in both animate and inanimate patients
  • Exp 2 results – shift to SVO
    • J speakers never verbalized SVO; K speakers rarely
    • Both J&K speakers almost always gestured top-level verb in 2nd position between the top-level subject and the embedded subject
    • In the embedded clause patients were gestured before the action almost always, but more often in non-reversible events (both for J&K speakers)
  • Results predicted by noisy-channel but not by the combination of SOV default and native-language order
Experiment 3
  • Alternative explanation of previous results
    • Minimizing syntactic dependency distances
    • Number of words between a syntactic head (verb) and its dependents (subject and object)
    • Shorter dependencies are easier
  • Shift from SOV to SVO given that SVO allows for shorter dependency distances
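The dependency-distance account can be illustrated by counting the tokens that separate the action from the agent as the patient description grows. The token names and the choice to place modifiers directly before the patient are assumptions made for this sketch.

```python
def dependency_distance(tokens, head, dependent):
    """Number of tokens strictly between `head` and `dependent`."""
    return abs(tokens.index(head) - tokens.index(dependent)) - 1

def gesture_sequence(order, patient_modifiers):
    """Token list for `order` ("SOV" or "SVO"), with the hypothetical
    modifier tokens placed directly before the patient."""
    phrase = {
        "S": ["agent"],
        "V": ["action"],
        "O": list(patient_modifiers) + ["patient"],
    }
    return [token for role in order for token in phrase[role]]

for n_mods in range(4):
    mods = [f"mod{i}" for i in range(n_mods)]
    sov = dependency_distance(gesture_sequence("SOV", mods), "action", "agent")
    svo = dependency_distance(gesture_sequence("SVO", mods), "action", "agent")
    print(n_mods, sov, svo)
```

In SOV every extra patient modifier pushes the action one token further from the agent, while in SVO the two stay adjacent, which is why this account predicts more SVO gestures for longer patient descriptions.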
Experiment 3 - method
  • Animations of a boy and a girl interacting with one of a set of objects:
    • Circle/star/heart which was either
    • Spotted/striped (surface); in a box/pail (container); wearing a top/witch’s hat (headwear)
    • Giving/putting/intransitive event
  • Participants were to gesture each event and the features of the object
  • If participants are sensitive to the distance between agent and verb, then SVO gesture order should be more frequent for longer patient descriptions
  • No such shift predicted by noisy channel – patient is not a possible agent of the verb, adding modifiers will not affect the recoverability of who is doing what to whom
Experiment 3 - results
  • Gestured patient before action for most events
  • Verbalized action before patient for most events
  • Even with long productions, participants still gestured patient before action, consistent with the noisy-channel hypothesis and not with the dependency-distance hypothesis
  • English speakers have a strong SOV preference for non-reversible events even when the inanimate patient has up to 3 features to be gestured
  • SOV seems to be the preferred word order in human communication
  • For reversible events the preference for SOV disappears in favor of SVO
  • Although native speakers of SOV languages gesture SOV in simple events, they shift to SVO for more complex ones
  • This shift to SVO occurs in order to maximize meaning recoverability
  • Case marking is often used in SOV
    • Mitigates the confusability of subject and object, helping to retain the default SOV
  • If no case marking is used, the order shifts to SVO
  • A large majority of SOV languages are case marked, whereas few SVO languages are
  • Used location in space as possible case marking in the experiments
    • Of the case-marked gestures most had SOV order
  • Animacy-dependent case marking
    • Many languages mark only animate direct objects
  • Non-SVO languages have more word-order flexibility than SVO languages
    • Contain other mechanisms for disambiguation
    • So languages with fixed word order are mostly SVO
  • No need for sophisticated innate machinery to explain word-order variation
  • Many aspects of crosslinguistic word-order variance are easily explained by communicative or memory pressures