CSA305: Natural Language Algorithms

jorn

Presentation Transcript


  1. CSA305: Natural Language Algorithms Deterministic and Non Deterministic Recognition CSA3050 NLP Algorithms

  2. Acknowledgement
     • Material presented adapted from Jurafsky and Martin, Ch. 2

  3. Representation of Automata using Transition Tables

  4. Transition Table Representation in Prolog
     % State  b  a  !
     s(0,1,0,0).
     s(1,0,2,0).
     s(2,0,3,0).
     s(3,0,3,4).
     s(4,0,0,0).
     % Column order is b, a, !, matching the arcs s(0,b,1), s(1,a,2), ... on the next slide.
     next(OldState,b,NewState) :- s(OldState,NewState,_,_).
     next(OldState,a,NewState) :- s(OldState,_,NewState,_).
     next(OldState,'!',NewState) :- s(OldState,_,_,NewState).

  5. A Better Representation
     s(0,b,1).
     s(1,a,2).
     s(2,a,3).
     s(3,a,3).
     s(3,'!',4).
     next(OldState,Sym,NewState) :- s(OldState,Sym,NewState).

  6. The Process of Recognition 1
     • Start in the initial state and at the first symbol of the word.
     • If there is an arc labelled with that symbol, the machine transitions to the next state, and the symbol is consumed.
     • The process continues with successive symbols until ...

  7. The Process of Recognition 2
     ... one of these conditions holds:
     • A. All symbols in the input are consumed.
       • If the current state is final, succeed; else fail.
     • B. There are no transitions out of the current state for the current symbol.
       • Fail.

  8. Deterministic Recognition
     • A deterministic algorithm is one that has no choice points.
     • The following algorithm takes as input a tape and an automaton.
     • It returns accept if the string on the tape is accepted by the automaton, else reject.

  9. DETERMINISTIC FSA RECOGNITION
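
A minimal Python sketch of deterministic recognition as described on slides 6 to 8. The transition table reuses the arcs from slide 5; the dictionary layout, the function name `d_recognise`, and the choice of state 4 as the only final state are assumptions made for illustration, not part of the original slides.

```python
# Arcs from slide 5, keyed on (state, symbol). Layout and names are illustrative.
SHEEP_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}
SHEEP_FINAL = {4}  # assumption: state 4 is the only final state


def d_recognise(tape, table, final_states, start=0):
    """Return 'accept' if the automaton accepts the tape, else 'reject'."""
    state = start
    for symbol in tape:
        if (state, symbol) not in table:
            return "reject"  # condition B: no arc for the current symbol
        state = table[(state, symbol)]  # follow the arc; the symbol is consumed
    # Condition A: all input consumed, so succeed only in a final state.
    return "accept" if state in final_states else "reject"


print(d_recognise("baaa!", SHEEP_TABLE, SHEEP_FINAL))
print(d_recognise("ba!", SHEEP_TABLE, SHEEP_FINAL))
```

There is no choice point anywhere in the loop: each (state, symbol) pair maps to at most one next state, which is what makes the procedure deterministic.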

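  10. Skeleton of Prolog Implementation
      % drec(Tape, Machine, State, Result)
      % Input exhausted: succeed with yes only in a final state.
      drec([], _M, S, yes) :- final(S), !.
      % Follow the arc for the head symbol and commit to it (no choice points).
      drec([H|T], M, S, Result) :- tran(M,S,H,N), !, drec(T,M,N,Result).
      % Otherwise: no arc, or input exhausted in a non-final state.
      drec(_, _, _, no).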

  11. Failure States
      • We can regard failure as a special state.
      • That state is reached by adding supplementary arcs that represent invalid input.

  12. Adding a Failure State
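
One way to picture the failure state is to complete the transition table so that every missing arc leads to a dedicated sink. The sketch below assumes the slide 5 arcs, a hypothetical state name `"fail"`, and hypothetical helper names `add_failure_state` and `run`.

```python
# Arcs from slide 5, keyed on (state, symbol). Layout and names are illustrative.
SHEEP_TABLE = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,
    (3, "!"): 4,
}


def add_failure_state(table, states, alphabet, fail="fail"):
    """Return a total table: every missing (state, symbol) arc goes to `fail`."""
    total = dict(table)
    for state in list(states) + [fail]:
        for symbol in alphabet:
            # Existing arcs are kept; only the gaps are filled in.
            total.setdefault((state, symbol), fail)
    return total


def run(tape, table, start=0):
    """Follow arcs in a total table and return the state reached."""
    state = start
    for symbol in tape:
        state = table[(state, symbol)]
    return state


TOTAL = add_failure_state(SHEEP_TABLE, range(5), "ba!")
print(run("baa!", TOTAL))
print(run("ba!", TOTAL))
```

Because the failure state loops to itself on every symbol, the recogniser never needs a separate "no transition" check for symbols in the alphabet; invalid input simply ends in a non-final state.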

  13. Deterministic versus Non-Deterministic Recognition
      • The behaviour of the automata we have considered so far is fully determined by the current state and the input symbol.
      • The recognition process is then said to be deterministic.
      • This is not necessarily the case. An automaton may have:
        • several arcs with the same label leaving a state;
        • ε-transitions: arcs with no label.
      • Automata like this are called non-deterministic.

  14. Non Deterministic FAs

  15. Non Deterministic Recognition
      • There are three ways of dealing with non-deterministic recognition:
        • Backtracking: at every choice point, record the state and the as yet unexplored choices.
        • Lookahead: peek ahead n symbols in the input in order to decide which path to take.
        • Parallel search: look at every path in parallel.

  16. ND-RECOGNISE
      function ND-RECOGNISE(tape, machine) returns accept or reject
        agenda ← {(q0(machine), 0)}
        search_state ← NEXT(agenda)
        loop
          if ACCEPT-STATE?(search_state) = true
            then return accept
            else agenda ← agenda ∪ GENERATE-NEW-STATES(search_state)
          if agenda is empty
            then return reject
            else search_state ← NEXT(agenda)
        end

  17. ACCEPT-STATE?
      function ACCEPT-STATE?(search_state)
        mstate ← first(search_state)
        tape_pos ← second(search_state)
        if tape[tape_pos] = end_input and IS-FINAL?(mstate)
          then return true
          else return false

  18. GENERATE-NEW-STATES
      function GENERATE-NEW-STATES(search_state)
        mstate ← first(search_state)
        tape_pos ← second(search_state)
        return {(x, tape_pos) | x ∈ trantable[mstate, ε]}
               ∪ {(x, tape_pos + 1) | x ∈ trantable[mstate, tape[tape_pos]]}
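
The three functions on slides 16 to 18 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the original implementation: the example automaton, the use of the empty string as the ε label, the test `tape_pos == len(tape)` in place of an `end_input` marker, and the `seen` set (added so that ε-loops cannot make the search run forever) are all choices made for this sketch.

```python
EPSILON = ""  # label used here for ε-arcs (an assumption of this sketch)

# Hypothetical non-deterministic automaton: state 2 has two arcs labelled "a".
NFSA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}
FINAL = {4}


def accept_state(search_state, tape, final_states):
    """True if the whole tape is consumed and the machine state is final."""
    mstate, tape_pos = search_state
    return tape_pos == len(tape) and mstate in final_states


def generate_new_states(search_state, tape, table):
    """Search states reachable by one ε-arc or by consuming one symbol."""
    mstate, tape_pos = search_state
    new_states = {(x, tape_pos) for x in table.get((mstate, EPSILON), set())}
    if tape_pos < len(tape):
        targets = table.get((mstate, tape[tape_pos]), set())
        new_states |= {(x, tape_pos + 1) for x in targets}
    return new_states


def nd_recognise(tape, table, final_states, start=0):
    agenda = [(start, 0)]
    seen = {(start, 0)}  # not in the slide pseudocode; guards against ε-loops
    while agenda:
        search_state = agenda.pop()  # taking the newest entry gives depth-first search
        if accept_state(search_state, tape, final_states):
            return "accept"
        for new_state in generate_new_states(search_state, tape, table):
            if new_state not in seen:
                seen.add(new_state)
                agenda.append(new_state)
    return "reject"


print(nd_recognise("baaa!", NFSA, FINAL))
print(nd_recognise("ba!", NFSA, FINAL))
```

A search state pairs a machine state with a tape position, so the same machine state can appear on the agenda several times at different positions.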

  19. Recognition as Search
      • Recognition can be regarded as a search problem:
        • Initial state, goal state
        • Rules
        • Strategy
      • Different search behaviours (depth first, breadth first) can be evoked by managing the agenda in different ways.
      • See Jurafsky & Martin, sect. 2.2
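
To illustrate how agenda management changes the search behaviour, the sketch below runs the same recogniser with the agenda treated as a stack (depth first) or as a queue (breadth first) and records the order in which search states are explored. The automaton, the `strategy` parameter, and the function name `search` are hypothetical, and ε-arcs are left out to keep the example short.

```python
from collections import deque

# (state, symbol) -> list of target states; list order fixes the order of choices.
NFSA = {
    (0, "b"): [1],
    (1, "a"): [2],
    (2, "a"): [2, 3],  # choice point
    (3, "!"): [4],
}
FINAL = {4}


def search(tape, table, final_states, strategy, start=0):
    """Return (result, visited); visited lists search states in exploration order."""
    agenda = deque([(start, 0)])
    visited = []
    while agenda:
        # Stack discipline gives depth-first search, queue discipline breadth-first.
        state, pos = agenda.pop() if strategy == "depth" else agenda.popleft()
        visited.append((state, pos))
        if pos == len(tape) and state in final_states:
            return "accept", visited
        if pos < len(tape):
            for target in table.get((state, tape[pos]), []):
                agenda.append((target, pos + 1))
    return "reject", visited


for strategy in ("depth", "breadth"):
    result, visited = search("baaa!", NFSA, FINAL, strategy)
    print(strategy, result, visited)
```

Both strategies reach the same verdict; only the order of exploration, and therefore the amount of work done before the answer is found, differs.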

  20. Deterministic and Non-Deterministic FSAs
      • The class of languages recognisable by NDFSAs is identical to that recognisable by DFSAs.
      • For every NDFSA ND there is an equivalent DFSA D.
      • The states of D correspond to sets of states in ND.
      • If N is the number of states in ND, the number of states in D is ≤ 2^N.
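
The correspondence between states of D and sets of states of ND is usually realised by the subset construction. The sketch below is a simplified version that assumes an automaton without ε-arcs; the example automaton and the function name `determinise` are hypothetical.

```python
# Hypothetical NDFSA without ε-arcs: (state, symbol) -> set of target states.
NFSA = {
    (0, "b"): {1},
    (1, "a"): {2},
    (2, "a"): {2, 3},
    (3, "!"): {4},
}
FINAL = {4}


def determinise(nfsa, alphabet, final_states, start=0):
    """Subset construction; each DFSA state is a frozenset of NDFSA states."""
    d_start = frozenset([start])
    table = {}
    seen = {d_start}
    todo = [d_start]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # Union of the targets of every member state on this symbol.
            target = frozenset(
                x for q in current for x in nfsa.get((q, symbol), ())
            )
            if not target:
                continue  # no arc on this symbol; left undefined
            table[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A DFSA state is final if it contains at least one final NDFSA state.
    d_final = {s for s in seen if s & final_states}
    return table, d_start, d_final


table, d_start, d_final = determinise(NFSA, "ba!", FINAL)
d_states = {d_start} | set(table.values())
print(len(d_states), "DFSA states; bound is", 2 ** 5)
```

Only the subsets actually reachable from the start state are built, so the resulting machine is often much smaller than the 2^N worst case.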
