Why Syntax is Impossible

Mike Dowman


Syntax

  • Languages have tens of thousands of words

  • Some combinations of words make valid sentences

  • Others don’t

  • No one understands the grammar of any language


Syntax is Complicated!

I saw Bill with Mary yesterday.

You saw WHO with Mary yesterday?!

Who did you see with Mary yesterday?


I saw Bill and Mary yesterday.

You saw WHO and Mary yesterday?!


*Who did you see and Mary yesterday? (ungrammatical)


Generative Grammar

  • An explicit formal system that defines the set of valid sentences in a language

  • And maybe also explains what each one means

  • Generative grammar is the core research topic in linguistics

  • Includes strongly nativist theories and theories proposing that languages are primarily learned


Grammar Writing

  • Linguists take a selection of possible sentences

  • And obtain grammaticality judgments for those sentences

  • Then they produce a grammar that accounts for all the data


Grammar Coverage

  • Linguists’ grammars only work for selected sentences

  • They can’t explain most naturally occurring sentences

  • The more data we consider, the more surprising quirks of syntax emerge


Children’s Language Acquisition

  • Kids observe a limited number of example sentences

  • But quickly internalize a system that correctly characterizes the whole language

[Diagram: E-language → LAD → I-language]


How can kids do syntax when linguists can’t?

  • Innate component of language (provided by genes)

  • Learned component of language (provided by language data)


  • Linguists have to infer both

  • Children only have to infer the learned component


Information Theory

  • Both components of language must contain some amount of information

  • Data available to children must provide at least as much information as is in the learned component

  • This puts a limit on the complexity of the learned component of language


Linguists’ Task

  • Linguists need to have at least as much information as is in the learned and innate components together

  • Can use data from multiple languages to try to characterize innate components

  • And can use positive and negative data


Correspondence to Linguistic Theories

Small learned component = parameter setting

Large learned component = learned languages

Small innate component = general learning mechanism

Large innate component = universal grammar


Size of Each Component


Which component is large?

  • As we haven’t yet managed to produce a generative grammar, at least one of the innate and learned components must be large

  • Children learn relatively easily, so the learned component can’t be too big



How big could the innate component be?

  • Genome contains 3 billion base pairs = 6 billion bits

  • Cell metabolism adds more information

  • Each base pair can be modified

  • Huge amount of information!
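
The 6 billion bit figure follows from two bits per base pair, since each position holds one of four bases. A minimal sketch of that arithmetic; the function name and the megabyte conversion are illustrative additions, and the result is only the raw capacity of the sequence, not an estimate of how much of it relates to language:

```python
import math

BASE_PAIRS = 3_000_000_000  # figure quoted on the slide
DISTINCT_BASES = 4          # A, C, G, T


def genome_capacity_bits(base_pairs: int = BASE_PAIRS) -> int:
    """Raw information capacity of the sequence in bits."""
    bits_per_base_pair = int(math.log2(DISTINCT_BASES))  # log2(4) = 2
    return base_pairs * bits_per_base_pair


bits = genome_capacity_bits()
megabytes = bits / 8 / 1_000_000  # decimal megabytes
print(f"{bits:,} bits, about {megabytes:.0f} MB")
```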


What could be in a huge innate component?

  • Not word forms: these vary from language to language

  • Grammaticality patterns

  • Rules of syntax would be hugely complex


Impossibility of Syntax

  • Grammaticality judgments on average can provide no more than one bit of information each

  • If syntax is hugely complex, there will be many grammars that are compatible with any given body of data

  • But all but one of these grammars would fail when tested on enough new data
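
A rough way to see the counting argument: if a grammar takes N bits to specify, there are 2^N candidates, and a judgment worth at most one bit can at best halve the set that is still compatible with the data. The sketch below assumes that idealized best case for the linguist; the function name and the example sizes are hypothetical.

```python
def remaining_grammars_log2(grammar_bits: int, judgments: int) -> int:
    """log2 of the number of candidate grammars still compatible with the
    data, assuming every judgment rules out exactly half of the remaining
    candidates (the most a one-bit answer can do)."""
    return max(grammar_bits - judgments, 0)


# Hypothetical sizes: a 2,000-bit grammar after 500 judgments.
left = remaining_grammars_log2(2_000, 500)
print(f"at least 2^{left} grammars still fit the data")
```

Only when the number of judgments reaches the number of bits in the grammar can the candidate set shrink to a single grammar.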


A Concrete Example

  • A multi-agent model

  • Each agent has:

    innate component

    learned component

  • Both are bit strings of fixed length

  • Sentences are 100 bit strings
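
One possible representation of such an agent, as a sketch only. The class name, the helper functions, the use of Python's `random` module and the 8-bit component length are all assumptions made for illustration; the slides give only the 100-bit sentence length.

```python
import random
from dataclasses import dataclass

SENTENCE_BITS = 100  # sentence length given on the slide


@dataclass
class Agent:
    innate: list   # fixed-length list of 0/1 values
    learned: list  # fixed-length list of 0/1 values


def random_bits(n: int, rng: random.Random) -> list:
    """A random bit string of length n, as a list of ints."""
    return [rng.randint(0, 1) for _ in range(n)]


def random_sentence(rng: random.Random) -> int:
    """A sentence stored as an integer with at most 100 bits."""
    return rng.getrandbits(SENTENCE_BITS)


rng = random.Random(0)
agent = Agent(innate=random_bits(8, rng), learned=random_bits(8, rng))
sentence = random_sentence(rng)
```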


Deciding on the Grammaticality of a Sentence 1

  • Treat the sentence as a binary number

  • Find:

    b_i = s mod n_i

    b_l = s mod n_l

    b is an index to a bit in the innate (b_i) or learned (b_l) component

    n is the number of bits in the innate (n_i) or learned (n_l) component

    s is the sentence, read as a binary number
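
The index calculation can be sketched as follows, taking s to be the value of the sentence when read as a binary number, as the first bullet describes. The function name and the toy sizes are hypothetical.

```python
def selected_indices(sentence_bits: str, n_innate: int, n_learned: int):
    """Return (b_i, b_l): the positions in the innate and learned
    components that this sentence selects."""
    s = int(sentence_bits, 2)  # treat the sentence as a binary number
    return s % n_innate, s % n_learned


# Toy example: the 4-bit sentence 1101 is 13 in decimal.
b_i, b_l = selected_indices("1101", n_innate=8, n_learned=3)
print(b_i, b_l)  # 13 mod 8 = 5, 13 mod 3 = 1
```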


Deciding on the Grammaticality of a Sentence 2

  • A pseudo-random function maps from the two selected bits plus the sentence to a Boolean grammaticality judgment

  • It’s therefore typically necessary to know every bit of the sentence and both the innate and learned bits to predict the grammaticality of the sentence

     Every bit counts

  • Usually about half of sentences are grammatical, half ungrammatical
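
The slides do not say which pseudo-random function was used. The sketch below substitutes a SHA-256 hash as a stand-in, which is an assumption: it mixes the sentence with the two selected bits and keeps a single output bit as the judgment. The function name and component sizes are illustrative.

```python
import hashlib
import random


def is_grammatical(sentence: int, innate, learned) -> bool:
    """Judgment from the sentence plus the two bits it selects.
    SHA-256 is a stand-in for the unspecified pseudo-random function."""
    innate_bit = innate[sentence % len(innate)]
    learned_bit = learned[sentence % len(learned)]
    payload = f"{sentence}:{innate_bit}:{learned_bit}".encode()
    return bool(hashlib.sha256(payload).digest()[0] & 1)


rng = random.Random(0)
innate = [rng.randint(0, 1) for _ in range(1000)]
learned = [rng.randint(0, 1) for _ in range(1000)]
sample = [rng.getrandbits(100) for _ in range(2000)]
fraction = sum(is_grammatical(s, innate, learned) for s in sample) / len(sample)
print(f"fraction judged grammatical: {fraction:.2f}")
```

Because the hash depends on every bit of its input, changing one bit of the sentence or either selected bit gives an unrelated judgment, and the fraction judged grammatical should come out close to one half.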


4 Kinds of Agent

Teacher

Innate: 10101000

Learned: 10010101

Related

Innate: 10101000

Learned: 11110001

Unrelated

Innate: 10110101

Learned: 00111000

Linguist

Innate: 00110100

Learned: 10001100


Learning by Related, Unrelated

  • Observe a sentence from the teacher

  • Work out if it is grammatical according to current I-language

  • If not, invert the relevant bit of the learned component
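
A minimal sketch of this learning loop, under several assumptions that are not in the slides: the teacher only utters sentences it judges grammatical, the pseudo-random judgment is a SHA-256 hash, and the components are the 8-bit example strings from the "4 Kinds of Agent" slide. All function names are hypothetical.

```python
import hashlib
import random


def bits(text: str) -> list:
    return [int(c) for c in text]


def is_grammatical(sentence: int, innate, learned) -> bool:
    # SHA-256 stands in for the unspecified pseudo-random function.
    payload = (f"{sentence}:{innate[sentence % len(innate)]}:"
               f"{learned[sentence % len(learned)]}").encode()
    return bool(hashlib.sha256(payload).digest()[0] & 1)


def learn(teacher, learner, n_examples: int, rng: random.Random) -> list:
    """Return the learner's learned component after n_examples sentences."""
    t_innate, t_learned = teacher
    l_innate, l_learned = learner[0], list(learner[1])  # copy before editing
    seen = 0
    while seen < n_examples:
        s = rng.getrandbits(100)
        if not is_grammatical(s, t_innate, t_learned):
            continue  # assumption: the teacher only utters grammatical sentences
        seen += 1
        if not is_grammatical(s, l_innate, l_learned):
            l_learned[s % len(l_learned)] ^= 1  # invert the relevant bit
    return l_learned


teacher = (bits("10101000"), bits("10010101"))
related = (bits("10101000"), bits("11110001"))  # same innate component
result = learn(teacher, related, n_examples=2000, rng=random.Random(0))
print(result == teacher[1])
```

With a shared innate component, a learned bit that already matches the teacher's always yields the teacher's judgment and is never flipped, while a mismatched bit is flipped the first time it produces a wrong judgment, so the related learner should end up with the teacher's learned component.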


Grammar Inference by Linguists

  • Choose random sentences

  • Ask the teacher if they are grammatical

  • Store all sentences and grammaticality judgments

  • Search for a setting of innate and learned components that assigns the correct grammaticality rating to every sentence
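
The search step can be illustrated with a brute-force sketch over deliberately tiny components (3 bits each, so only 64 candidate grammars). The slides do not describe the search method actually used; exhaustive enumeration, the SHA-256 judgment function and all names here are assumptions.

```python
import hashlib
import random
from itertools import product


def is_grammatical(sentence: int, innate, learned) -> bool:
    # SHA-256 stands in for the unspecified pseudo-random function.
    payload = (f"{sentence}:{innate[sentence % len(innate)]}:"
               f"{learned[sentence % len(learned)]}").encode()
    return bool(hashlib.sha256(payload).digest()[0] & 1)


def consistent_grammars(data, n_innate: int, n_learned: int) -> list:
    """Every (innate, learned) setting that reproduces all stored judgments."""
    found = []
    for innate in product((0, 1), repeat=n_innate):
        for learned in product((0, 1), repeat=n_learned):
            if all(is_grammatical(s, innate, learned) == label
                   for s, label in data):
                found.append((innate, learned))
    return found


rng = random.Random(1)
teacher = ((1, 0, 1), (0, 1, 1))
sentences = [rng.getrandbits(100) for _ in range(40)]
data = [(s, is_grammatical(s, *teacher)) for s in sentences]  # ask the teacher

few = consistent_grammars(data[:3], 3, 3)
many = consistent_grammars(data, 3, 3)
print(len(few), len(many))
```

The teacher's own setting always survives, and more judgments can only shrink the set of surviving grammars. Exhaustive search is feasible here only because the components are tiny: the number of candidates doubles with every added bit.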


1,000 Bit Innate and Learned Components


1,000 Bit Innate Component 1,000,000 Bit Learned Component


1,000,000 Bit Innate Component 1,000 Bit Learned Component


Implications of Impossible Syntax

  • A linguist can write a grammar that will adequately characterize any body of data

  • But it will fail when tested on new data

  • Partial grammars are not a stepping stone to complete generative grammars


A Universal Law of Generative Grammar

Generative grammar is impossible if:

H(learned component) + H(innate component) > H(language data)

Unless we can use information from another source (genetic, neuroscientific, psycholinguistic)
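
The inequality can be applied to the three component sizes used in the simulations. The sketch assumes that each component's entropy equals its length in bits (uniformly random bit strings) and that each grammaticality judgment contributes at most one bit; the budget of 10,000 judgments is a hypothetical figure chosen for illustration.

```python
def not_ruled_out(h_learned: int, h_innate: int, h_data: int) -> bool:
    """False when the law says a complete grammar cannot be recovered,
    i.e. when H(learned) + H(innate) > H(language data)."""
    return h_learned + h_innate <= h_data


JUDGMENTS = 10_000  # hypothetical data budget, at most one bit per judgment

configs = {
    "1,000 innate / 1,000 learned": (1_000, 1_000),
    "1,000 innate / 1,000,000 learned": (1_000, 1_000_000),
    "1,000,000 innate / 1,000 learned": (1_000_000, 1_000),
}
for name, (innate, learned) in configs.items():
    verdict = "possible in principle" if not_ruled_out(learned, innate, JUDGMENTS) else "impossible"
    print(f"{name}: {verdict}")
```

The condition is necessary rather than sufficient: passing the check does not mean a search procedure will actually find the grammar.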


Why do Syntax?

  • Studying generative grammar may tell us something about the human mind

  • It won’t help us build natural language processing systems

  • Is studying rare and obscure constructions the best way to do syntax?


Conclusion

  • The idea that we can characterize a language by considering enough linguistic data is a hypothesis

  • It’s very unlikely that it’s possible to write a complete generative grammar

