Syntax:



  1. Syntax: If words “are more like humans than machines”- Let’s party!

  2. What is syntax? • Syntax is, as you know, the system of rules that governs how words are combined • But to understand it, we need to start by understanding functions

  3. The nature of computation • Syntax is a form of computation • Computation is essentially a mapping: a → b • In the simplest ‘computers’ (finite automata), the mappings are deterministic, from state to state: State a → State b • In more complex machines, we get non-deterministic mappings depending on context (memory): the same state may map onto any one of several states, and that memory may be under the control of the machine itself • Alan Turing (1936): very simple machines can compute anything computable
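The deterministic state-to-state mapping can be sketched as a small lookup table; the even/odd machine below is an invented illustration, not one from the slides:

```python
# Minimal sketch of a deterministic finite automaton: each (state, symbol)
# pair maps to exactly one next state, with no other memory.
def run_dfa(transitions, start, inputs):
    state = start
    for symbol in inputs:
        state = transitions[(state, symbol)]  # the mapping a -> b
    return state

# A toy machine that tracks whether it has seen an even or odd number of 1s.
t = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
print(run_dfa(t, "even", "1101"))  # three 1s seen -> "odd"
```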

  4. Functions • A function is just a mapping from a specific input to a specific output • The input and the output don't have to be numbers • NameProf(x) takes in the number of a course and maps it onto the name of the person teaching that course • So: NameProf(357) → Westbury • RazeTheHouse(x) takes a house as input and returns that house destroyed as output
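A minimal sketch of NameProf as a literal mapping; the lookup-table form is an assumption, and only the 357 → Westbury pair comes from the slide:

```python
def name_prof(course_number):
    """Map a course number to the name of the person teaching it."""
    instructors = {357: "Westbury"}  # hypothetical lookup table
    return instructors[course_number]

print(name_prof(357))  # -> Westbury
```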

  5. Primitive Functions • A primitive is a lowest-level function: one that can't be defined in terms of any other • Let’s consider an old favorite: ‘+’ • If we want to define a non-primitive function ‘AddOne’, we can: AddOne(x) = x + 1 • We haven’t added new functionality: we’ve just re-named what we had in a way that is convenient

  6. Functions of Functions • Functions can call other functions, including themselves (recursion) • Let's define a function AddTwo, which adds two • We already have AddOne • We can just define AddTwo as: AddTwo(x) → AddOne(AddOne(x)) → AddOne(x + 1) → (x + 1) + 1 • We haven’t added new functionality: we’ve just named something in a way that is convenient

  7. Functions of Functions of Functions • Let's say we want to define a function AddThree, which adds three • We already have AddTwo and AddOne • We can just define AddThree as: AddThree(x) → AddTwo(AddOne(x)) • At some point we may get tired of this game: we are wasting time and energy trying to name all these silly little functions, AddOne, AddTwo, AddThree… Will it never end?
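The chain of definitions on the last two slides runs as written once translated, here into Python as an illustration:

```python
def add_one(x):
    return x + 1  # defined in terms of the primitive '+'

def add_two(x):
    return add_one(add_one(x))  # a function defined by calling another

def add_three(x):
    return add_two(add_one(x))  # and another layer on top of that

print(add_three(4))  # -> 7
```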

  8. Generalizing functions • A more general solution would add ANY number to any input • But we already know how to do that, since we have addition as a primitive: AddN(x, n) → x + n • Notice the difference we had to introduce: we had to add a second input, or parameter • Why? Because the way AddOne was defined had a constant in it • We just said "let that constant be a variable", and so we got a much more powerful function that eliminated the need for thousands of other more specialized functions: AddOne, AddSixteen, AddSeventy, etc.
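The parameterized version is a one-liner; the point is that one function with an extra parameter replaces the whole AddOne, AddTwo, … family:

```python
def add_n(x, n):
    """Add any number n to x: the former constant is now a variable."""
    return x + n

print(add_n(10, 16))  # what a specialized AddSixteen(10) would compute -> 26
```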

  9. The magic of parameters • By adding one variable we got rid of an infinite number of functions, collapsing them all into a single function with two arguments • What we noticed, in essence, is that all cases of addition were similar: they could all be computed in the same way we were computing our primitive, + • Parameterization can be traded off against computation

  10. Hey, what about language? • This is the kind of functional collapse that Chomsky wants to do • He wants to show that many things that appear to be different are minor variations of the same function, just in the same way that AddOne and AddThreeHundred are minor variations of the same function • He wants to do it in the same kind of way we did: by saying, look, you have N functions here that are really just 1 function, plus an extra parameter

  11. How little can we get by with? • The question becomes: What is the simplest representation of the computation that is sentence-making? • This breaks down into the related questions: • What are the most primitive functions? • What are their parameters? • If we can identify a few primitive universal functions and some universal parameters, we may find deep underlying similarities between languages that appear on the surface to be different (as multiplication might appear different from addition at first sight)

  12. What syntax is not • One possible way syntax might work would be Markov chaining, i.e. probabilistic word chaining • Calculate the likelihood that one word follows another (its transition probability), and then select only from those words that have a probability > 0 of following the current word • A frequentist approach
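The frequentist word-chaining idea can be sketched as bigram counting; the toy corpus below is invented for illustration:

```python
from collections import defaultdict

# Count how often each word follows each other word in a (tiny) corpus.
corpus = "the dog bit me the dog ran the cat ran".split()

counts = defaultdict(lambda: defaultdict(int))
for w1, w2 in zip(corpus, corpus[1:]):
    counts[w1][w2] += 1

def transition_prob(w1, w2):
    """Estimated probability that w2 follows w1."""
    total = sum(counts[w1].values())
    return counts[w1][w2] / total if total else 0.0

print(transition_prob("the", "dog"))  # "dog" follows 2 of 3 uses of "the"
```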

  13. Two arguments against chaining • Chomsky's initial claim to fame is that he claimed to have proven that there is no possible way word-chaining devices could account for syntax • Not everyone is convinced, but everyone does agree that simple word-chaining devices won't work • Chomsky had two main arguments against them: • i.) Zero probability transitions • ii.) Relational dependencies

  14. i.) Zero probability transitions • We can produce and understand transitions that have zero probability (= have never been encountered before) • i.e. 'colorless green' and 'sleep furiously' had probably never been uttered before Chomsky wrote 'Colorless green ideas sleep furiously', yet we can all agree that the sentence is grammatical, so grammar cannot consist only of learned transitions • This means we can’t be chaining on words • It also indicates the autonomy of syntax from semantics • We can judge the grammaticality of sentences independently of their meaning

  15. ii.) Relational dependencies • Some sentences contain relational dependencies of a kind that simply cannot be captured by transition probabilities • For example, consider: "If I show you this sentence, then you will understand the problem” • There is a long-distance dependency from 'if' to 'then' that provably cannot be captured by a particular kind of transition-calculating device called a finite state machine • In plain language, the problem is simply that transition devices don't have a memory, so they can't 'force' a later transition to match an earlier one • An aside: there are ways to make transition devices deal with these problems, but they require all sorts of very clunky machinery (hugely redundant encoding) that seems very implausible

  16. ii.) Relational dependencies • The problem gets even more complicated because we can embed long-distance dependencies • Consider: "If either I show you this sentence or I explain the problem clearly, then you will understand what Chomsky's point was.” • Now we have a sentence we can all understand, but with a second dependency: the 'if' first has to close up the 'either' clause while remembering that it still needs a 'then' • There is not necessarily a simple lexical marker: I can also say "If I show you this sentence or I explain the problem clearly, you will understand what Chomsky's point was", and now there is no 'or' or 'then' to trigger the memory • Listen to everyday language and you'll see the point: such long-distance dependencies are not at all rare, but occur in many sentences and from a very early age
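As an aside not on the slides: the memory a finite-state machine lacks can be supplied by a stack. This toy checker (which deliberately ignores the bare uses of 'or' and 'then' that real English allows) illustrates how nested openers and closers get matched:

```python
# Pushdown-style sketch: openers go on a stack; each closer must match
# the most recent unclosed opener, which handles arbitrary nesting.
def check_dependencies(words):
    stack = []
    closes = {"then": "if", "or": "either"}
    for w in words:
        if w in ("if", "either"):
            stack.append(w)
        elif w in closes:
            if not stack or stack.pop() != closes[w]:
                return False  # closer with no matching opener
    return not stack  # every opener must eventually be closed

sent = "if either I show you this or I explain it then you will see".split()
print(check_dependencies(sent))  # -> True
```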

  17. ii.) Relational dependencies • There is a well-known grammatically correct sentence, said by a young child to his father, that ends with five prepositions closing four embeddings: "Daddy, what did you bring up that book that I don't want to be read to out of up for?" • By the time he gets to "read", the child has to remember the following dependencies: • i.) 'to be read' requires 'to' • ii.) 'that book that' requires 'out of' • iii.) 'bring' requires 'up' • iv.) 'what' requires 'for' • And he does…!

  18. Sentences aren’t beads on a string • Chomsky's solution was one that many take for granted now: to suggest that sentences are not flat lists of words but have a tree structure, and that it is not the individual words but parts of the tree that are the units of language • i.e. syntactic constraints operate not at the single-word level but at the role level, where a role may be played by a multiword string or a single word • Each element that can fill a role is called a constituent

  19. A constituent • An example is an NP (noun phrase), which is defined in Chomsky's original tree notation as (det) A* N • This just means that it contains an optional determiner (like 'a', 'the', 'some', 'many'), plus any number of adjectives (including zero), plus a noun • 'dog' is a noun phrase • So is 'A big hairy rabid frightening nasty dog'
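The rule (det) A* N maps naturally onto a regular expression over part-of-speech tags; the tiny lexicon below is an assumption for illustration:

```python
import re

# Hypothetical lexicon mapping words to part-of-speech tags.
lexicon = {"a": "det", "the": "det", "big": "adj", "hairy": "adj",
           "rabid": "adj", "frightening": "adj", "nasty": "adj",
           "dog": "n", "rabbit": "n"}

def is_np(phrase):
    """Check a phrase against (det) A* N: optional D, any As, one N."""
    tags = "".join({"det": "D", "adj": "A", "n": "N"}[lexicon[w]]
                   for w in phrase.split())
    return re.fullmatch(r"D?A*N", tags) is not None

print(is_np("dog"))                                      # -> True
print(is_np("a big hairy rabid frightening nasty dog"))  # -> True
print(is_np("big a dog"))                                # -> False
```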

  20. So what? • When our units are defined at the constituent level instead of the word level, we can easily understand how we re-use parts in different places, as in 'A big hairy rabid frightening dog bit me' and 'I gave the big hairy rabid frightening dog a steak' • It also bears on the dependency problem, because we can have trees that constitute a 'memory' for the whole sentence

  21. So what? • We can have functions (= rules) like: • S → either S or S • S → if S then S • This kind of self-referentiality, in which an object (here, a sentence) is defined in terms of itself, is recursion • Recursion allows for very tightly defined functions, which simplify complex calculations by defining them in terms of simpler cases

  22. A classic example: Factorial • Factorial(x): if x = 1 → 1; otherwise → x * Factorial(x − 1)
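The slide's definition runs as written once translated, here into Python:

```python
def factorial(x):
    # Base case stops the recursion; otherwise the function calls itself
    # on a simpler case, exactly as on the slide.
    if x == 1:
        return 1
    return x * factorial(x - 1)

print(factorial(5))  # 5 * 4 * 3 * 2 * 1 -> 120
```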

  23. Calling each other • With recursion in language you can also calculate a very complex output with very simple rules: S → either S or S; S → if S then S • With these two rules we can get sentences like: "If either my big hairy frightening dog is rabid or my unrepaired car brakes are faulty, then either I will be going to the scary grey hospital this afternoon or I will be going mad.” • This seems to match our ‘mentalese’: ‘A big hairy rabid frightening dog’ is certainly a dog, and we want to be able to move our attention around from the dog to the brakes and the hospital without being ‘thrown off’ by the number of adjectives or qualifying clauses attached to those things in the sentence
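A sketch of how the two recursive rules generate unboundedly complex sentences from simple ones; the base clauses below are invented stand-ins for real simple sentences:

```python
import random

# Hypothetical terminal sentences for S to bottom out on.
BASE = ["my dog is rabid", "my brakes are faulty", "I will go mad"]

def sentence(depth):
    """Expand S using the slide's two rules until depth runs out."""
    if depth == 0:
        return random.choice(BASE)
    if random.choice([True, False]):
        return f"either {sentence(depth - 1)} or {sentence(depth - 1)}"
    return f"if {sentence(depth - 1)} then {sentence(depth - 1)}"

random.seed(0)
print(sentence(2))
```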

  24. Example • "Tonight's program will discuss stress, exercise, and sex with Celtic forward Scott Wedman, Dr. Ruth Westheimer, and Dick Cavett" • This can be parsed as VP → VP NP PP • VP (verb phrase) = ‘will discuss’ • NP (noun phrase) = ‘stress, exercise, and sex’ • PP (prepositional phrase) → P NP • P = ‘with’ • NP = ‘Celtic forward Scott Wedman, Dr. Ruth Westheimer, and Dick Cavett’

  25. Example • "Tonight's program will discuss stress, exercise, and sex with Celtic forward Scott Wedman, Dr. Ruth Westheimer, and Dick Cavett" • This can also be parsed as VP → VP NP • VP = ‘will discuss’ • NP → N PP [‘…sex with Dick Cavett…’] • N = ‘stress, exercise, and sex’ • PP → P NP • P = ‘with’ • NP = ‘Celtic forward Scott Wedman, Dr. Ruth Westheimer, and Dick Cavett’
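The two bracketings from the last two slides can be written out as nested tuples (an informal stand-in for the trees, not a standard notation):

```python
# Reading 1: the PP attaches high, modifying the discussing itself.
high_attachment = (
    "VP",
    ("VP", "will discuss"),
    ("NP", "stress, exercise, and sex"),
    ("PP", ("P", "with"),
           ("NP", "Celtic forward Scott Wedman, "
                  "Dr. Ruth Westheimer, and Dick Cavett")))

# Reading 2: the PP attaches low, inside the object NP, yielding the
# unintended "sex with ... Dick Cavett" interpretation.
low_attachment = (
    "VP",
    ("VP", "will discuss"),
    ("NP", ("N", "stress, exercise, and sex"),
           ("PP", ("P", "with"),
                  ("NP", "Celtic forward Scott Wedman, "
                         "Dr. Ruth Westheimer, and Dick Cavett"))))

print(high_attachment != low_attachment)  # same words, different trees -> True
```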

  26. How do we know what is what? • Each part of speech is defined by the role it plays • so a noun is anything that can go in the NP slot • There are two main principles for understanding slots: • i.) The head determines the meaning • ii.) Slots determine what roles each element in a sentence can play

  27. i.) The head determines the meaning • ‘Fox in socks’ is about a fox, not about socks • ‘Flying to Rio before the taxman catches him’ is about flying, not about catching • There are hard rules in every language that determine which component plays the head role • We saw one English rule above: NP → N PP • So ‘sex with Dick Cavett’ is about a specific kind of sex, not about a specific attribute of Dick Cavett • ‘with Dick Cavett’ also fills a slot, called a modifier

  28. ii.) The choreographing of roles • Slots determine what roles each element in a sentence can play • "Ruth Westheimer discussed sex with Dick Cavett" choreographs three things: the discusser (Ruth), the object (sex), and the recipient (Cavett) • Each one of these roles is called an argument to make clear that they are being fed into a function- that function is determined by the tree structure • every end-point (branch) of the tree has to be filled, so the number of branches = the number of arguments.

  29. So what? • When we start to think of things in terms of trees with arguments, then we can start to see some deep regularities in language • For example, NP and VP turn out to be very similar in their abstract structure… • Tune in next time…
