Weekly Report

Weekly Report. Semantic Web Research Center, Duhyeon Jin, 2011-4-15. Contents: Goal; Last discussion; Problems & resolutions; Achievements; Plan. Goal: to construct case frames for 675 “PN+하” verbs. Currently, automatic construction from the Sejong Treebank (43,828 trees).

Presentation Transcript


  1. Weekly Report
     Semantic Web Research Center
     Duhyeon Jin
     2011-4-15

  2. Contents
     • Goal
     • Last discussion
     • Problems & resolutions
     • Achievements
     • Plan

  3. Goal
     • To construct case frames for 675 “PN+하” verbs
     • Currently, automatic construction from the Sejong Treebank
     [Diagram labels: Sejong Treebank, 43,828 trees; total 9,598 trees for 675 verbs; ?,??? case frames]

  4. Last discussion
     • Problem
       • Only 2,701 case frame instances were extracted from 9,598 parse trees → many arguments are missing.
       • Understanding of the tree structure of the Sejong Treebank.
       • Grammatical characteristics of Korean as applied in the Sejong Treebank.

  5. Resolutions
     • Problem 1
       • Dative and locative arguments (N3) in the Sejong Treebank were missed; only the subject (N1) and object (N2) were considered.
     • Resolution
       • In the Sejong Treebank, phrases marked with particles such as “으로, 에, 에게, 에서..” (roughly “with/toward”, “at/to”, “to (a person)”, “at/from”) are tagged as AJT (adjunct).
       • Let the algorithm also extract “X_AJT” tags.
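The resolution for Problem 1 can be illustrated with a minimal sketch. It assumes a simplified, hypothetical tree encoding (nested tuples of the form `(label, child, ...)` with string leaves), not the actual Sejong Treebank file format, and the function names are invented for this example. It only shows the effect of adding AJT to the set of function tags treated as arguments; it does not restrict the search to a particular target verb.

```python
# Hypothetical, simplified tree: (label, child, child, ...); leaves are strings.
# This is NOT the real Sejong Treebank format, only an illustration.
ARGUMENT_FUNCTIONS = ("SBJ", "OBJ", "AJT")  # AJT added so that N3 arguments are kept


def is_argument_label(label):
    """True for labels of the form X_SBJ, X_OBJ or X_AJT (e.g. NP_AJT)."""
    return "_" in label and label.rsplit("_", 1)[1] in ARGUMENT_FUNCTIONS


def leaves(node):
    """Return the leaf strings under a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1:]:
        out.extend(leaves(child))
    return out


def collect_arguments(node):
    """Return (label, text) pairs for argument-labelled constituents, top-down.

    The search does not descend into a constituent once it is collected.
    """
    if isinstance(node, str):
        return []
    found = []
    for child in node[1:]:
        if not isinstance(child, str) and is_argument_label(child[0]):
            found.append((child[0], " ".join(leaves(child))))
        else:
            found.extend(collect_arguments(child))
    return found


# "Cheolsu arrived at school": subject plus an 에-marked locative phrase.
tree = ("S",
        ("NP_SBJ", "철수가"),
        ("VP",
         ("NP_AJT", "학교에"),
         ("VP", "도착하였다")))

arguments = collect_arguments(tree)
```

With `"AJT"` removed from `ARGUMENT_FUNCTIONS`, the same call would return only the subject, which mirrors the missing-N3 problem described on the slide.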

  6. Resolutions
     • Problem 2
       • An auxiliary verb blocks the algorithm from reaching arguments that could originally be complements of the target verb.
       • Ex> ~수 있 (“can, be able to”)
     • Resolution
       • Allow the verb beside “수/NNB” to be a head word in the tree.
     [Tree diagram; node labels: VP_MOD, VP_MOD]

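  7. Resolutions
     • Problem 3
       • A large number of relative clauses are missed.
       • Ex> 만두를 먹은 철수가 (“Cheolsu, who ate the dumplings”), 철수가 먹은 만두가 (“the dumplings that Cheolsu ate”), 철수가 살던 곳이 (“the place where Cheolsu used to live”).. 만두를 먹은 철수를, 철수가 먹은 만두를… (the same phrases with object marking)
     • Resolution
       • Modify the algorithm to extract an NP that is modified by VP_MOD or S_MOD.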
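A rough sketch of the relative-clause resolution, under the same kind of assumption: trees are hypothetical nested tuples `(label, child, ...)` with string leaves rather than the real treebank format, and `modified_nps` is an invented name. It pairs each VP_MOD or S_MOD constituent with the NP sibling that immediately follows it, so that the head noun can be considered as a candidate argument of the verb inside the modifier.

```python
# Hypothetical, simplified tree: (label, child, child, ...); leaves are strings.
MODIFIER_LABELS = ("VP_MOD", "S_MOD")


def text(node):
    """Join the leaf strings under a node with spaces."""
    if isinstance(node, str):
        return node
    return " ".join(text(child) for child in node[1:])


def modified_nps(node):
    """Yield (modifier_text, head_np_text) for each NP that directly
    follows a VP_MOD or S_MOD sibling, anywhere in the tree."""
    if isinstance(node, str):
        return
    children = node[1:]
    for left, right in zip(children, children[1:]):
        if (not isinstance(left, str) and left[0] in MODIFIER_LABELS
                and not isinstance(right, str) and right[0].startswith("NP")):
            yield text(left), text(right)
    for child in children:
        yield from modified_nps(child)


# 만두를 먹은 철수가: "Cheolsu, who ate the dumplings" (subject-marked)
tree = ("NP_SBJ",
        ("VP_MOD",
         ("NP_OBJ", "만두를"),
         ("VP_MOD", "먹은")),
        ("NP_SBJ", "철수가"))

pairs = list(modified_nps(tree))
```

The sketch only finds candidates. Deciding which role the head noun plays for the embedded verb, or whether it is an argument at all, is a separate step.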

  8. Achievement
     • Extracted 5,354 more case frames
     • The whole workflow is described on the web:
       • http://sysx2.kaist.ac.kr/wiki/index.php/세종_구문_분석_코퍼스_논항추출_작업
     • The remaining 2,137 trees are not well-formed trees or yield case frames with no arguments.
     [Chart: extracted case frames; labels 7,461, 2,107, 7,491, 2,137]

  9. Refining extracted arguments
     • Currently refining the extracted data
     • The extracted arguments still have problems
       • Problem 1. Adjuncts extracted unnecessarily as arguments (Ex> 동작을 본능적으로 표현하.., “express the motion instinctively”)
       • Problem 2. Modified NPs that cannot be an argument (Ex> 문제를 해결할 힘, “the power to solve the problem”; the power cannot resolve the problem)
     • Resolutions
       • 1. Compare the arguments against the Sejong Dictionary
       • 2. Manual checking of the 1,134 case frames that were extracted from relative clauses.
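The first resolution, comparing extracted arguments against a dictionary, might look roughly like the sketch below. Everything here is an assumption made for illustration: the `LICENSED` table is a toy stand-in whose entry is not copied from the Sejong Dictionary, arguments are represented as `(word, particle)` pairs, and `refine` is an invented name.

```python
# Toy table of case particles licensed per verb stem.
# Illustrative only; the entry is NOT taken from the Sejong Dictionary.
LICENSED = {"표현하": {"가", "를"}}

# Map particle allomorphs to one canonical form before the lookup.
ALLOMORPHS = {"이": "가", "을": "를"}


def refine(verb, arguments):
    """Keep only (word, particle) arguments whose particle the verb's
    toy dictionary entry licenses."""
    allowed = LICENSED.get(verb)
    if allowed is None:
        # No entry for this verb: keep everything for manual checking.
        return list(arguments)
    return [(word, particle) for (word, particle) in arguments
            if ALLOMORPHS.get(particle, particle) in allowed]


# 동작을 본능적으로 표현하..: the 으로-marked phrase is an adjunct here.
extracted = [("동작", "을"), ("본능적", "으로")]
refined = refine("표현하", extracted)
```

Verbs without an entry are passed through unchanged in this sketch, on the assumption that such cases would go to manual checking rather than be discarded.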

  10. Plan
     • After refining the extracted data (~4.15):
       • Assign concepts to each argument using the dictionary and CoreNet (4.18 ~ 22)
       • Refine and evaluate all extracted case frames (4.25 ~ 29)
