Parsing by chunks

  1. Parsing by chunks Steven P. Abney (Presenter: Park Kyung-mi)

  2. 0 introduction • A chunk is the unit in which a sentence is read (a prosodic pattern) • (1) [I begin] [with an intuition]: [when I read] [a sentence], [I read it] [a chunk] [at a time] • A chunk marks a kind of grammatical transition point • A typical chunk consists of several function words and a single content word • Chunks fit fixed templates well • A CFG is well suited to describing the internal structure of chunks (a toy fragment is sketched below) • Relationships between chunks are determined by lexical selection • Co-occurrence between chunks is sensitive to the head word within each chunk • The order in which chunks occur is more flexible than the order of words within a chunk
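
Since the slide notes that a CFG suffices for chunk-internal structure, here is a minimal toy fragment in Python; the rules are invented for illustration and are not Abney's actual grammar:

```python
# Toy chunk grammar: each rule is (LHS, RHS). A typical chunk is a
# constellation of function words around one content word, so the
# right-hand sides stay short and template-like. Rules are invented
# for illustration; they are not Abney's grammar.
CHUNK_RULES = [
    ("NP", ("Det", "N")),         # "a sentence"
    ("NP", ("Det", "Adj", "N")),  # "a big dog"
    ("NP", ("N",)),               # "chunks"
    ("PP", ("P", "NP")),          # "with an intuition"
    ("VP", ("Aux", "V")),         # "will read"
    ("VP", ("V",)),               # "read"
]
```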

  3. 1 chunks • Psychological evidence for the existence of chunks • Gee and Grosjean (1983) studied performance structures • A structure of word clustering emerges from several kinds of experimental data • Performance structures are best predicted by Φ-phrases • A Φ-phrase is built by breaking the input string after each syntactic head • Exception: a function word associated with the preceding content word, such as an object pronoun, is grouped with that content word • Shortcomings • Attributive adjectives must be assumed not to count as syntactic heads • Ex) otherwise “a big dog” could come out as two chunks • Φ-phrases assign no syntactic structure to chunks

  4. 1 chunks • In this paper… • Chunks are given syntactic structure: each chunk spans a connected subgraph of the sentence’s parse tree • Chunks are defined in terms of major heads • The major heads are all the content words • Exception: a content word that appears between a function word f and the content word that f selects is excluded • Ex) “a man proud of his son”, “the proud man” • h: a major head • The root of h’s chunk: the highest node in the parse tree whose s-head is h • The s-head of a phrase is its most prominent word • Ex) a verb is the s-head of a sentence; a noun is the s-head of a noun phrase or a prepositional phrase

  5. 1 chunks • The s-head and the syntactic head may differ • Ex) the head of a sentence is the abstract element Infl, and the head of an embedded sentence (CP) is the complementizer C (Chomsky 1986) • Ex) the head of a PP is the preposition P, not the noun • Ex) under the DP-analysis, the head of a noun phrase is the determiner, and the head of an adjective phrase is a degree element (Abney 1987) • The s-head is defined in terms of the syntactic head • If the syntactic head h of a phrase P is a content word, then h is the s-head of P • If h is a function word, the s-head of P is the s-head of the phrase selected by h (a recursive sketch follows)
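
A minimal sketch of that recursive definition. The `Word`/`Phrase` classes and the simplified example tree (the DP layer is skipped) are my assumptions, not code from the paper:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Phrase:
    syntactic_head: "Word"

@dataclass
class Word:
    text: str
    is_content_word: bool
    selected_phrase: Optional[Phrase] = None  # phrase a function word selects

def s_head(phrase: Phrase) -> Word:
    # A content-word head is the s-head; for a function-word head,
    # recurse into the phrase that the function word selects.
    h = phrase.syntactic_head
    return h if h.is_content_word else s_head(h.selected_phrase)

# "with an intuition": P heads the PP, but its s-head is the noun.
np = Phrase(Word("intuition", True))
pp = Phrase(Word("with", False, selected_phrase=np))
print(s_head(pp).text)  # -> intuition
```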

  6. 1 chunks • The parse tree TC of a chunk C is a subgraph of the global parse tree T • The root r of TC is the highest node whose s-head is the content word that defines C • Ex) in (2) the major heads are “man”, “sitting”, “suitcase”; r is the DP whose s-head is “man”, the CP whose s-head is “sitting”, and the PP whose s-head is “suitcase” • TC is the largest subgraph of T dominated by r that does not contain the root of another chunk • Ex) in (2) the parse tree of the “man” chunk is the subtree rooted at the DP, that of the “sitting” chunk at the CP, and that of the “suitcase” chunk at the PP • Here the CP stands for the complete global parse tree with the subtrees DP and PP removed • A special case: words whose inclusion would give a chunk discontinuous boundaries, such as complementizers and prepositions, are excluded

  7. 1 chunks • Φ-phrases are generated from chunks by sweeping orphaned words into an adjacent chunk • Φ-phrases, unlike chunks, do not always span connected subgraphs of the parse tree • Ex) in (3), “that John” constitutes a Φ-phrase, but syntactically the phrase “that John” contains two unconnected fragments • The correspondence between prosodic units and syntactic units is not direct, but mediated by chunks • Φ-phrases are elements of a prosodic level of representation • Chunks and global parse trees are elements of two different levels of syntactic representation

  8. 1 chunks • A final issue regarding the definition of chunks is the status of pronouns • 1. Since pronouns function syntactically like noun chunks, we would like to consider them chunks • 2. They are generally stressless, suggesting that they not be treated as separate chunks • A reasonable solution is to consider them lexical noun phrases and assign them the same status as orphaned words • At the level of chunks, they are orphaned words, belonging to no chunk • At the level of Φ-phrases, they are swept into an adjacent chunk • At the level of syntax, they are treated like any other noun phrase

  9. 1 chunks • Which adjacent chunk an orphaned word is swept into (see the sketch below) • If the orphaned word takes a complement, it is swept into the nearest chunk in the direction of its complement • Otherwise it is swept into the nearest chunk in the direction of its syntactic governor • Ex) a subject pronoun is swept into the following chunk, an object pronoun into the preceding chunk • The units marked in (1) are Φ-phrases (not chunks)
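
A sketch of that sweeping rule under an invented representation; the direction fields are assumptions standing in for what a real implementation would read off the parse:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Chunk:
    words: list = field(default_factory=list)

@dataclass
class Orphan:
    word: str
    complement_dir: Optional[str]  # "left"/"right" if the word takes a complement
    governor_dir: str              # direction of the word's syntactic governor

def sweep(orphan: Orphan, left: Chunk, right: Chunk) -> Chunk:
    # Prefer the complement direction; otherwise follow the governor.
    direction = orphan.complement_dir or orphan.governor_dir
    target = right if direction == "right" else left
    if target is right:
        target.words.insert(0, orphan.word)  # e.g. a subject pronoun
    else:
        target.words.append(orphan.word)     # e.g. an object pronoun
    return target

# "read it quickly": the object pronoun "it" has a leftward governor.
prev, nxt = Chunk(["read"]), Chunk(["quickly"])
sweep(Orphan("it", None, "left"), prev, nxt)
print(prev.words)  # -> ['read', 'it']
```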

  10. 2 Structure of the parser • A parser processes text in two stages • A tokenizer/morphological analyzer converts a stream of characters into a stream of words • The parser proper converts the stream of words into a parsed sentence, or a stream of parsed sentences • In a chunking parser, the syntactic analyzer is decomposed into two separate stages (a pipeline sketch follows) • The chunker converts the stream of words into a stream of chunks • The attacher converts the stream of chunks into a stream of sentences • It attaches one chunk to another by adding the missing arcs • Ex) in (2), adding an IP–VP arc and lowering the PP under the VP
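
A minimal sketch of this three-stage decomposition. The stage bodies are stubs (one word per chunk, no real attachment); only the stream-to-stream interfaces reflect the slide:

```python
from typing import Iterable, Iterator

def tokenize(chars: str) -> Iterator[str]:
    """Tokenizer/morphological analyzer: characters -> words (stub)."""
    yield from chars.split()

def chunker(words: Iterable[str]) -> Iterator[list]:
    """Words -> chunks. Stub: each word becomes its own chunk."""
    for w in words:
        yield [w]

def attacher(chunks: Iterable[list]) -> list:
    """Chunks -> parsed sentence(s) by adding the missing arcs
    (e.g. lowering a PP under a VP). Stub: just collects chunks."""
    return list(chunks)

def parse(text: str) -> list:
    return attacher(chunker(tokenize(text)))

print(parse("the man sat"))  # -> [['the'], ['man'], ['sat']]
```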

  11. 2 Structure of the parser • To illustrate the action of these three stages: • Words are sets of readings; readings, but not words, have unique syntactic categories, feature sets, etc. (e.g. “of course”) • Lexical ambiguity is often resolvable within chunks • There is no distinction in the final parse between nodes built by the chunker and nodes built by the attacher

  12. 3 Chunker: 3.1 LR parsing • The chunker is a non-deterministic version of an LR parser (Knuth 1965), employing a best-first search • An LR parser is a deterministic bottom-up parser for CFGs • It shifts words from the input stream onto the stack until it recognizes a sequence of words matching the right-hand side of a rule of the grammar • It then reduces that sequence to a single node whose category is given by the left-hand side of the rule (a toy shift-reduce loop is sketched below)
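
A toy shift-reduce loop over category sequences, reusing the invented rule format from the earlier fragment. It reduces greedily and has no LR state table or lookahead, so it only recognizes this tiny grammar; the conflicts that real control must resolve are the next slide's topic:

```python
GRAMMAR = [("NP", ("Det", "N")), ("VP", ("V", "NP")), ("S", ("NP", "VP"))]

def shift_reduce(categories):
    stack, buffer = [], list(categories)
    while buffer or len(stack) > 1:
        for lhs, rhs in GRAMMAR:
            if len(stack) >= len(rhs) and tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]        # reduce: RHS -> single LHS node
                stack.append(lhs)
                break
        else:
            if not buffer:
                break                        # stuck: nothing to shift or reduce
            stack.append(buffer.pop(0))      # shift the next word's category
    return stack

print(shift_reduce(["Det", "N", "V", "Det", "N"]))  # -> ['S']
```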

  13. 3.1 LR parsing • Control is mediated by LR states (kept on a separate control stack) • LR states correspond to sets of items • An item is a rule with a dot marking how much of the rule has already been seen • The kernel of an item set is the set of items with some category preceding the dot • Ex) if (4) is at the top of the control stack, we may shift either a Det or an N • After shifting an N, the new kernel is [NP→N·], which calls for reduction of N to NP • Conflict: reduce by the rule VP→V NP, or shift a V • In this case lookahead decides the conflict: shift if the next word is a V, reduce if there is no input left (a dotted-item sketch follows)
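
A dotted-item sketch matching this description (the representation is invented): the kernel keeps items whose dot has advanced, and `goto` computes the next state's kernel after moving over a category:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    lhs: str
    rhs: tuple   # e.g. Item("NP", ("Det", "N"), 1) is [NP -> Det . N]
    dot: int

    def next_symbol(self):
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

def kernel(items):
    """Items with some category already preceding the dot."""
    return {it for it in items if it.dot > 0}

def goto(items, category):
    """Advance the dot over `category`: the kernel of the next state."""
    return {Item(it.lhs, it.rhs, it.dot + 1)
            for it in items if it.next_symbol() == category}

state = {Item("NP", ("Det", "N"), 0), Item("NP", ("N",), 0)}
print(goto(state, "N"))  # -> {Item(lhs='NP', rhs=('N',), dot=1)}, i.e. [NP -> N .]
```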

  14. 3.2 grammar • The lexicon includes ’s and the possessive pronouns in category D • Modals and to are in category Infl • Selectional constraints • Ex) Aux imposes restrictions on its complement • Ex) a DP whose determiner is ’s does not appear in a PP chunk • The grammar is incomplete, but covers most chunks

  15. 3.3 non-determinism in the chunker • There are two sources of non-determinism in the chunker • The points at which chunks end are not explicitly marked • This leads to ambiguities involving chunks of different lengths • A given word may belong to more than one category • This leads to conflicts in which the chunker does not know whether to shift the following word onto the stack as an N or as a V • The aim of using best-first search is to approach deterministic parsing without losing robustness • Marcus-style deterministic parsing has two related drawbacks • The complexity of grammar development and debugging increases too rapidly • If the parser’s best initial guess at every choice point leads to a dead end, the parser simply fails

  16. 3.3 non-determinism in the chunker • The chunker builds one task for each possible next action • A task is a tuple that includes the current configuration, a next action, and a score • A score is an estimate of how likely it is that a given task will lead to the best parse • The chunker takes the best task from the priority queue and executes the task’s next action, producing a new configuration • A new set of tasks is computed for the new configuration and placed on the priority queue (a sketch of this loop follows)
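
A generic sketch of this control loop, with invented interfaces: tasks live on a priority queue keyed by score; the best is popped, its action executed, and successor tasks for the new configuration are pushed back:

```python
import heapq, itertools

_tie = itertools.count()  # tie-breaker so heapq never compares configurations

def best_first(initial_tasks, successors, is_solution):
    """initial_tasks / successors yield (score, config, action) triples;
    higher score = more promising (negated for Python's min-heap)."""
    queue = []
    for score, config, action in initial_tasks:
        heapq.heappush(queue, (-score, next(_tie), config, action))
    while queue:
        _, _, config, action = heapq.heappop(queue)  # best task first
        new_config = action(config)                  # execute its next action
        if is_solution(new_config):
            return new_config
        for score, cfg, act in successors(new_config):
            heapq.heappush(queue, (-score, next(_tie), cfg, act))
    return None
```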

  17. 3.3 non-determinism in the chunker • Executing the first task yields the configuration ([[NP→N·]], [N], 1) • There is only one possible next action, [RE NP→N], producing a single new task • A score is a vector of length 4 • Values range from 0 to negative infinity • As is desirable for best-first search, scores decrease monotonically as the parser proceeds (see the note below) • This guarantees that the first solution found is a best solution
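
One convenient realization of such scores (my assumption, not the paper's representation) is tuples of non-positive numbers compared lexicographically, which Python's tuple ordering gives for free; since scores only ever decrease, the first solution found is a best one, by the usual best-first argument:

```python
# Length-4 score vectors, ranging from 0 down to negative infinity,
# compared lexicographically (Python compares tuples this way).
perfect = (0, 0, 0, 0)
dinged  = (0, -1, 0, 0)   # penalized on the second component
assert perfect > dinged
```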

  18. 3.4 deciding where a chunk ends • A problem arises in deciding where a chunk ends • Every word has an alternate reading as an end-of-input marker • LR parsers treat end-of-input as a grammatical category • One piece of information we must keep with a task, whether we hallucinate end-of-input marks or not, is which subset of the readings of the lookahead word the task is legal on • Ex) suppose we have just shifted the word many onto the stack as a Q, and the current configuration is: • (6) ([[QP→Q·]], [Q], 1)

  19. 3.4 deciding where a chunk ends • The next word is are, which has two readings • There is only one legal next action from configuration (6), reduce by QP→Q • That reduction is legal only if the next word is a noun • Since the noun reading of are is rare, we should disprefer the task T calling for reduction by QP→Q • If we keep sets of lookahead readings with each task, we can slip fake end-of-input markers in among those lookahead readings (a sketch follows)
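
A sketch of keeping lookahead reading sets with tasks and slipping a fake end-of-input mark among them; the lexicon and category names here are invented:

```python
EOI = "<end-of-input>"

def lookahead_readings(word, lexicon):
    # Every word also carries an alternate reading as an end-of-input
    # marker, since a chunk boundary may fall before it.
    return set(lexicon.get(word, ())) | {EOI}

lexicon = {"are": {"V", "N"}}   # "are": verb, or the (rare) noun reading
legal_on = {"N"}                # [RE QP->Q] is legal only before a noun

readings = lookahead_readings("are", lexicon)
task_ok = readings & legal_on   # the task survives only on the rare N reading
print(task_ok)                  # -> {'N'}
```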

  20. 4 Attacher: 4.1 attachment ambiguities and lexical selection • The attacher’s main job is dealing with attachment ambiguities • (5) Prefer argument attachment; prefer verb attachment • (6) Prefer low attachment • Potential attachment sites are ranked as follows: • Attachment as verb argument (best) • Attachment as argument of a non-verb • Attachment as verb modifier • Attachment as modifier of a non-verb • The second factor is the relative height of attachment sites, counted as the number of sentence (IP) nodes below the attachment site on the rightmost branch of the tree (a ranking sketch follows)
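
A sketch of a two-factor sort key over candidate sites. The representation is invented, and the height convention (treating fewer IP nodes below the site as lower, hence preferred) is an assumption about the bookkeeping:

```python
SITE_RANK = {                 # smaller = preferred (the slide's ordering)
    "verb-argument": 0,
    "nonverb-argument": 1,
    "verb-modifier": 2,
    "nonverb-modifier": 3,
}

def attachment_key(site_type, ip_nodes_below):
    """Sort key for candidate attachment sites: site type first, then
    height (assumed here: fewer IP nodes below = lower = preferred)."""
    return (SITE_RANK[site_type], ip_nodes_below)

sites = [("nonverb-modifier", 0), ("verb-modifier", 1), ("verb-argument", 1)]
print(min(sites, key=lambda s: attachment_key(*s)))  # -> ('verb-argument', 1)
```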

  21. 4.1 attachment ambiguities and lexical selection • Special machinery: the attacher must deal with words’ selectional properties • The lexical selectional properties of a head determine which phrases can co-occur with that head • A given word has a set of subcategorization frames • There is a good deal of freedom in the order in which arguments appear, but there are also some constraints • Two positional constraints • ‘only appears first’ (annotation on the slot: ‘<’) • ‘only appears last’ (annotation on the slot: ‘>’) • Arguments are also marked as obligatory (unmarked), optional (‘?’), or iterable (‘*’) • Ex) [DP<?, PP*, CP>] (a frame-matching sketch follows)
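
A sketch of checking an argument sequence against a frame in this notation. The parsing and matching logic is invented, and it assumes each category fills at most one slot:

```python
def parse_frame(frame):
    """Split slots like 'DP<?' into (category, position, repetition)."""
    out = []
    for slot in frame:
        cat = slot.rstrip("<>?*")
        pos = "<" if "<" in slot else ">" if ">" in slot else ""
        rep = "?" if "?" in slot else "*" if "*" in slot else ""
        out.append((cat, pos, rep))
    return out

def matches(frame, args):
    slots = parse_frame(frame)
    if any(a not in {c for c, _, _ in slots} for a in args):
        return False                                   # unlicensed argument
    for cat, pos, rep in slots:
        n = args.count(cat)
        if rep == "" and n != 1: return False          # obligatory: exactly one
        if rep == "?" and n > 1: return False          # optional: at most one
        if pos == "<" and n and args[0] != cat: return False  # only first
        if pos == ">" and n and args[-1] != cat: return False # only last
    return True

frame = ["DP<?", "PP*", "CP>"]
print(matches(frame, ["DP", "PP", "PP", "CP"]))  # -> True
print(matches(frame, ["PP", "DP", "CP"]))        # -> False: DP must be first
```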

  22. 4.1 attachment ambiguities and lexical selection • A frameset contains a specification of the adjuncts
