
Syntax and Processing it: Definite Clause Grammars in Prolog (optional material)

Syntax and Processing it: Definite Clause Grammars in Prolog (optional material). John Barnden, School of Computer Science, University of Birmingham. Natural Language Processing 1, 2014/15 Semester 2. DCGs: Introduction.


Presentation Transcript


  1. Syntax and Processing it: Definite Clause Grammars in Prolog (optional material). John Barnden, School of Computer Science, University of Birmingham. Natural Language Processing 1, 2014/15 Semester 2.

  2. DCGs: Introduction
     • A way of writing syntactic recognizers and parsers directly in Prolog.
     • We write Prolog rules of a special type. These look very much like context-free (CF) grammar productions.
     • Recognition or parsing happens by the normal Prolog computation process.
     • Different structures can be recognized or created for the same sentence by Prolog’s normal alternative-answer process, i.e., syntactic ambiguity is handled naturally.
     • In the parsing case, syntax trees are produced.
     • Grammatical constraints such as agreement are also easy to include.
     • The rules can be translated into ordinary Prolog, but with a lot of extra parameters that are tedious to write and that obscure the main information.
     • The Prolog system performs this translation into normal Prolog automatically when the rules are loaded.
     • Caution: DCGs provide only top-down, depth-first parsing, because of Prolog’s approach to using rules.
     • But other strategies may be better. More on this later.

  3. DCGs, contd: Recognition
     • See the link on the Slides page to a toy recognizer in DCG that you can examine and play with.
     • Example DCG rules for recognition of non-terminal categories:
         s --> np, vp.
         np --> noun, pp.
         np --> det, adj, noun, pp.
     • Example DCG rules for recognition of terminal categories:
         det --> [a].     det --> [an].    det --> [the].
         noun --> [cat].  noun --> [dog].  noun --> [dogs].
         verb --> [dogs].
       (There is another, more economical method.)
     • The program can be run in two ways:
         s([a, dog, sits, on, a, mat], []).        np([a, dog], []).
         phrase(s, [a, dog, sits, on, a, mat]).    phrase(np, [a, dog]).
     • The second argument for s, np, etc. is for catching extra words:
         np([a, dog, sits, on, a], X).    gives X = [sits, on, a].
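
The recognition rules above can be tried out with a small self-contained grammar. The following is only a sketch, not the toy recognizer linked from the Slides page: the vp, pp and prep rules, the simplified np rules, and the words sits, on and mat are assumptions added so that the example queries have something to match.

```prolog
% Toy recognizer (illustrative sketch; vp, pp, prep and the simplified
% np rules are assumptions, not taken from the slides).
s  --> np, vp.
np --> det, noun.
np --> det, noun, pp.
vp --> verb.
vp --> verb, pp.
pp --> prep, np.

% Terminal categories, one rule per word.
det  --> [a].
det  --> [the].
noun --> [dog].
noun --> [mat].
verb --> [sits].
prep --> [on].

% Example queries:
% ?- phrase(s, [a, dog, sits, on, a, mat]).
% ?- s([a, dog, sits, on, a, mat], []).
% ?- np([a, dog, sits, on, a], X).      % X = [sits, on, a]
```

Calling s/2 or np/2 directly relies on the two list arguments that the DCG translation adds to every category; phrase/2 hides them.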

  4. Advantage of DCGs over ordinary Prolog
     • Consider the abstract grammar rules
         S → NP VP
         NP → Det Noun
     • Here’s how they could be implemented in ordinary Prolog (for recognition only, but syntax-tree construction and grammatical-category checking [see later] can be added):
         s(WordList, Residue) :-
             np(WordList, Residue_to_pass_on), vp(Residue_to_pass_on, Residue).
         np(WordList, Residue) :-
             det(WordList, Residue_to_pass_on), noun(Residue_to_pass_on, Residue).
         det([the | Residue], Residue).
         noun([dog | Residue], Residue).
     • Can be called as in: s([a, dog, sits, on, a, dog], []).
     • Exercise: See the ordinary-Prolog version of the recognizer linked from the Slides page.
     • Compared to the DCG form, we have the extra WordList and Residue arguments in every syntactic-category predicate. Tedious, error-prone.
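
To make the ordinary-Prolog fragment above runnable on its own, it needs a vp predicate and a slightly larger lexicon. The sketch below supplies them as assumptions (a VP consisting of a single verb, plus the words a and sits); it is not the version linked from the Slides page.

```prolog
% Hand-written equivalent of:  s --> np, vp.   np --> det, noun.
% Each predicate takes the input word list and returns the unconsumed residue.
s(WordList, Residue) :-
    np(WordList, Rest),
    vp(Rest, Residue).

np(WordList, Residue) :-
    det(WordList, Rest),
    noun(Rest, Residue).

% Assumed rule (not on the slide): VP -> Verb.
vp(WordList, Residue) :-
    verb(WordList, Residue).

% Lexicon: each clause consumes one word from the front of the list.
det([the | Residue], Residue).
det([a | Residue], Residue).
noun([dog | Residue], Residue).
verb([sits | Residue], Residue).

% Example queries:
% ?- s([the, dog, sits], []).
% ?- s([the, dog, sits, quietly], R).    % R = [quietly]
```

Every clause has to thread the two list arguments by hand, which is exactly the bookkeeping that the --> notation removes.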

  5. DCGs: Additions
     • Can embed ordinary Prolog within grammar rules.
     • Can use disjunction and cuts.
     • Can add arguments to the category symbols (np, det, etc.) so as to
         • Build syntax trees, i.e. do parsing, not just recognition
         • Include “grammatical categories” (used to enforce constraints such as agreement)
         • Build semantic structures.
     • Will see some of this in following slides.
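
As a small illustration of the first two additions, the sketch below uses a disjunction to make an adjective optional and a brace-enclosed Prolog goal to look words up in a lexicon. The predicate name is_noun and the word list are hypothetical, chosen only for this example.

```prolog
% Disjunction inside a rule body: the adjective is optional
% ([] matches no words at all).
np --> det, ( adj ; [] ), noun.

det --> [the].
adj --> [big].

% Ordinary Prolog embedded in braces: accept any word that the
% (hypothetical) lexicon predicate is_noun/1 recognizes.
noun --> [Word], { is_noun(Word) }.

is_noun(dog).
is_noun(cat).

% Example queries:
% ?- phrase(np, [the, big, dog]).
% ?- phrase(np, [the, cat]).
```

Goals in braces do not consume any input; they are called as normal Prolog at that point in the rule.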

  6. DCGs: Parsing
     • Add a parameter to each category symbol, delivering a node of the syntax tree:
         vp(vp_node(Verb_node, PP_node)) --> verb(Verb_node), pp(PP_node).
         verb(verb_node(sits)) --> [sits].
     • The program can again be run in two ways:
         s(ST, [a, dog, sits, on, a, mat], []).
         phrase(s(ST), [a, dog, sits, on, a, mat]).
     • See the links on the Slides page to toy parsers in DCG that you can examine and play with.
     • So far: “basic” parser1.
     • An initial exercise: add new words and new NP rules.
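
A complete miniature parser in this style might look as follows. This is a sketch rather than the parser1 file from the Slides page: the s, np, pp and prep rules and the node names other than vp_node and verb_node are assumptions.

```prolog
% Each category carries one extra argument: the syntax-tree node it builds.
s(s_node(NP, VP))   --> np(NP), vp(VP).
np(np_node(Det, N)) --> det(Det), noun(N).
vp(vp_node(V, PP))  --> verb(V), pp(PP).
pp(pp_node(P, NP))  --> prep(P), np(NP).

% Lexical rules build leaf nodes.
det(det_node(a))      --> [a].
det(det_node(the))    --> [the].
noun(noun_node(dog))  --> [dog].
noun(noun_node(mat))  --> [mat].
verb(verb_node(sits)) --> [sits].
prep(prep_node(on))   --> [on].

% Example query:
% ?- phrase(s(ST), [a, dog, sits, on, a, mat]).
% ST = s_node(np_node(det_node(a), noun_node(dog)),
%             vp_node(verb_node(sits),
%                     pp_node(prep_node(on),
%                             np_node(det_node(a), noun_node(mat))))).
```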

  7. DCGs: Syntactic Ambiguity
     • Suppose we add two extra rules:
         vp(vp_node(Verb_node, PP_node1, PP_node2)) -->
             verb(Verb_node), pp(PP_node1), pp(PP_node2).
         np(np_node(Det_node, N_node, PP_node)) -->
             det(Det_node), noun(N_node), pp(PP_node).
     • Then we get two different structures for
         A dog sits on the mat with the flowers.
     • Exercise:
         • Work out by hand what structures you should get, both as drawn syntax trees and as Prolog forms.
         • Try it out using the relevant parser on the Slides page.
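
One way to see the ambiguity is to collect all parses with findall/3. The grammar below is a sketch: only the two extra rules come from the slide, and the remaining rules and the lexicon are assumptions made to keep the example self-contained.

```prolog
s(s_node(NP, VP)) --> np(NP), vp(VP).

np(np_node(Det, N))     --> det(Det), noun(N).
np(np_node(Det, N, PP)) --> det(Det), noun(N), pp(PP).   % extra rule

vp(vp_node(V, PP))       --> verb(V), pp(PP).
vp(vp_node(V, PP1, PP2)) --> verb(V), pp(PP1), pp(PP2).  % extra rule

pp(pp_node(P, NP)) --> prep(P), np(NP).

det(det_node(a))         --> [a].
det(det_node(the))       --> [the].
noun(noun_node(dog))     --> [dog].
noun(noun_node(mat))     --> [mat].
noun(noun_node(flowers)) --> [flowers].
verb(verb_node(sits))    --> [sits].
prep(prep_node(on))      --> [on].
prep(prep_node(with))    --> [with].

% Collect every syntax tree for a sentence by backtracking.
all_parses(Sentence, Trees) :-
    findall(Tree, phrase(s(Tree), Sentence), Trees).

% Example query:
% ?- all_parses([a, dog, sits, on, the, mat, with, the, flowers], Ts),
%    length(Ts, N).
% N = 2.
```

In one tree the phrase "with the flowers" attaches inside the NP "the mat"; in the other it is a second PP of the VP. Typing ; at the top level after a single phrase/2 answer shows the same alternatives one at a time.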

  8. Terminals: A Better Implementation
     • verb(verb_node(Word)) --> [Word], {verb_pred(Word)}.
     • The part in braces is ordinary Prolog.
     • Individual verbs are included as follows:
         verb_pred(sit).  verb_pred(sits).  verb_pred(hates).
     • This is less writing per individual verb, and concentrates the node-building into one place.
     • Looks possibly less efficient, because of the extra step.
     • BUT in modern Prologs it speeds up execution:
         • by making the DCG terminal symbol call (verb in the top line above) deterministic
         • by making the call of the lexical predicates (verb_pred, etc.) deterministic.
     • Exercise: amend one of the toy parsers by using the above method.
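
The lexicon method above can be exercised in isolation with the following sketch. The noun rule and noun_pred/1 are assumptions that simply mirror the verb rule from the slide.

```prolog
% One DCG rule per terminal category; the lexicon lives in plain facts.
verb(verb_node(Word)) --> [Word], { verb_pred(Word) }.
noun(noun_node(Word)) --> [Word], { noun_pred(Word) }.   % assumed analogue

verb_pred(sit).
verb_pred(sits).
verb_pred(hates).

noun_pred(dog).
noun_pred(cat).

% Example queries:
% ?- phrase(verb(Node), [sits]).     % Node = verb_node(sits)
% ?- phrase(verb(Node), [dog]).      % fails: dog is not in verb_pred/1
```

Adding a new verb now means adding one fact, with no change to the grammar rule that builds the node.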

  9. Grammatical Categories
     • A grammatical category (GC) is a dimension along which (some) lexical or syntactic constituents can vary in limited, systematic ways, such as (in English):
         • Number (singular or plural): lexically, nouns, verbs, determiners, numerals
         • Person (first, second and third): lexically, only for verbs, nouns and some pronouns
         • Tense (present, past [various forms], future): lexically, only for verbs
         • Gender (M, F, N [neither/neuter]): lexically, only some pronouns and some nouns
     • Syntactic constituents can sometimes inherit grammatical category values from their components, e.g. (without showing all possible GC values):
         • the big dog: 3rd person M/F/N singular NP // the big dogs: 3rd person M/F/N plural NP
         • we in the carpet trade: 1st person M/F plural NP // you silly idiot: 2nd person M/F singular NP
         • eloped with the gym teacher: past-tense VP // will go: future-tense VP
         • the woman with the long hair: female NP // the radio with the red knobs: neuter NP
     • A lexical or syntactic constituent can be ambiguous as to a GC value:
         • e.g. sheep: singular/plural; manage: singular/plural 1st/2nd person

  10. Grammatical Categories in DCGs, contd
      • Or, using the better lexicon representation:
          noun(n_node(Word), gcs(numb(Numb), person(third)))
              --> [Word], {noun_pred(Word, Numb)}.
          noun_pred(dog, singular).
          noun_pred(dogs, plural).
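
The lexicon-based noun rule above can be tested on its own. The sketch below adds two hypothetical entries for sheep to show how a word that is ambiguous in number yields one answer per GC value; the helper noun_numbers/2 is likewise an assumption for illustration.

```prolog
% Noun rule returning a tree node and a bundle of grammatical categories.
noun(n_node(Word), gcs(numb(Numb), person(third)))
    --> [Word], { noun_pred(Word, Numb) }.

noun_pred(dog, singular).
noun_pred(dogs, plural).
noun_pred(sheep, singular).   % hypothetical entries: sheep is
noun_pred(sheep, plural).     % ambiguous between singular and plural

% Hypothetical helper: all number values a word can take as a noun.
noun_numbers(Word, Numbers) :-
    findall(N, phrase(noun(_, gcs(numb(N), _)), [Word]), Numbers).

% Example queries:
% ?- phrase(noun(Node, GCs), [dogs]).
% ?- noun_numbers(sheep, Ns).        % Ns = [singular, plural]
```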

  11. Grammatical Categories in DCGs, contd
      • Enforcing agreement in an NP syntax rule:
          np(np_node(Det_node, N_node), gcs(Number_gc, Person_gc))
              --> det(Det_node, gcs(Number_gc, Person_gc)),
                  noun(N_node, gcs(Number_gc, Person_gc)).
      • OR more simply, if you don’t need to enforce a particular shape for gcs(...):
          np(np_node(Det_node, N_node), GCs)
              --> det(Det_node, GCs), noun(N_node, GCs).
      • Enforcing subject-NP / VP agreement (NB: doesn’t handle the case GC):
          s(s_node(NP_node, VP_node), GCs)
              --> np(NP_node, GCs), vp(VP_node, GCs).
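
Determiner/noun agreement can be checked with a small grammar such as the sketch below. The det rule, det_pred/2 and the choice to leave the determiner's person value unbound are assumptions; only the np and noun rules follow the slides.

```prolog
% Sharing the single variable GCs forces det and noun to agree.
np(np_node(Det_node, N_node), GCs)
    --> det(Det_node, GCs), noun(N_node, GCs).

% Assumed determiner rule: number comes from the lexicon,
% person is left unbound so that it unifies with the noun's value.
det(det_node(Word), gcs(numb(Numb), person(_)))
    --> [Word], { det_pred(Word, Numb) }.

noun(n_node(Word), gcs(numb(Numb), person(third)))
    --> [Word], { noun_pred(Word, Numb) }.

det_pred(a, singular).
det_pred(these, plural).
det_pred(the, _).             % "the" is compatible with either number

noun_pred(dog, singular).
noun_pred(dogs, plural).

% Example queries:
% ?- phrase(np(T, GCs), [these, dogs]).
% ?- phrase(np(T, GCs), [a, dogs]).      % fails: number mismatch
```

Agreement here is nothing more than unification: the mismatch in "a dogs" is a failed attempt to unify numb(singular) with numb(plural).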

  12. Grammatical Categories in DCGs, contd
      • Not enforcing agreement within part of a VP rule:
          vp(vp_node(Verb_node, PP_node), GCs)
              --> verb(Verb_node, GCs), pp(PP_node).
      • OR if you needed PP to return some GCs that didn’t matter:
          vp(vp_node(Verb_node, PP_node), GCs)
              --> verb(Verb_node, GCs), pp(PP_node, _).
      • Exercise: understand and play around with the GC version of the parser linked from the Slides page.
      • The program can again be run in two ways:
          s(ST, GCs, [a, dog, sits, on, a, mat], []).
          phrase(s(ST, GCs), [a, dog, sits, on, a, mat]).
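
Putting the GC rules together gives a parser that builds a tree and enforces subject/verb agreement. The sketch below is not the GC parser linked from the Slides page: the pp, det, verb and prep rules, all the lexicon predicates, and the simplified treatment of sit (plural only) are assumptions.

```prolog
s(s_node(NP, VP), GCs)   --> np(NP, GCs), vp(VP, GCs).
np(np_node(Det, N), GCs) --> det(Det, GCs), noun(N, GCs).
vp(vp_node(V, PP), GCs)  --> verb(V, GCs), pp(PP, _).   % PP's GCs ignored
pp(pp_node(P, NP), GCs)  --> prep(P), np(NP, GCs).

det(det_node(W), gcs(numb(Numb), person(_)))
    --> [W], { det_pred(W, Numb) }.
noun(n_node(W), gcs(numb(Numb), person(third)))
    --> [W], { noun_pred(W, Numb) }.
verb(verb_node(W), gcs(numb(Numb), person(Pers)))
    --> [W], { verb_pred(W, Numb, Pers) }.
prep(prep_node(W)) --> [W], { prep_pred(W) }.

det_pred(a, singular).
det_pred(the, _).
noun_pred(dog, singular).
noun_pred(dogs, plural).
noun_pred(mat, singular).
verb_pred(sits, singular, third).
verb_pred(sit, plural, _).        % simplification: plural uses only
prep_pred(on).

% Example queries:
% ?- phrase(s(ST, GCs), [a, dog, sits, on, a, mat]).
% ?- phrase(s(ST, GCs), [the, dogs, sits, on, a, mat]).   % fails
```

Because pp is called with an anonymous variable, the number of the noun inside the PP ("a mat") places no constraint on the subject or verb.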
