Relation extraction and the influence of automatic named-entity recognition

This study explores the influence of automatic named-entity recognition on relation extraction, presenting a novel approach for extracting relations between named entities from natural language documents. The effectiveness of kernel methods and the impact of noise are evaluated through experiments.

Presentation Transcript


  1. Relation extraction and the influence of automatic named-entity recognition • Presenter: Shao-Wei Cheng • Authors: Claudio Giuliano, Alberto Lavelli, and Lorenza Romano • TSLP 2007

  2. Outline • Motivation • Objective • Methodology • Named-entity recognition • Kernel Methods for Relation Extraction • Experiments • Conclusion • Personal Comments

  3. Motivation • Information extraction aims to extract structured information from unstructured or semi-structured textual documents. • NER performance is far from perfect, and its influence on relation-extraction performance is still under investigation. • Named Entity Recognition • Relation Extraction

  4. Objectives • The authors present a novel approach for extracting relations between named entities from natural language documents. • They evaluate the effect of automatic named-entity recognition on this approach. • Relation Extraction • Named Entity Recognition • Each candidate relation is labeled 1 if it holds and -1 otherwise.
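The 1 / -1 labeling on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one example per ordered pair of entities in a sentence, and the names `make_candidates`, `entities`, and `gold` and the tuple formats are hypothetical.

```python
from itertools import permutations


def make_candidates(entities, gold_relations):
    """Build one labeled example per ordered entity pair.

    entities: list of (entity_id, entity_type) tuples (assumed format).
    gold_relations: set of (arg1_id, arg2_id) pairs for which the relation holds.
    Returns a list of ((arg1_id, arg2_id), label), where label is 1 or -1.
    """
    examples = []
    for (id1, _), (id2, _) in permutations(entities, 2):
        # Label 1 if the relation holds for this ordered pair, otherwise -1.
        label = 1 if (id1, id2) in gold_relations else -1
        examples.append(((id1, id2), label))
    return examples


# Toy sentence with three entities and one annotated relation.
entities = [("e1", "PER"), ("e2", "LOC"), ("e3", "ORG")]
gold = {("e1", "e2")}
examples = make_candidates(entities, gold)
```

Framing the task this way turns relation extraction into binary classification, which is what allows an SVM to be applied to it.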

  5. Methodology • Named Entity Recognition • Method: CRFs, as provided in MALLET. • Processing (features for each token): • (a) the word itself, • (b) the PoS tag of the token, • (c) orthographic predicates, • (d) gazetteers of locations, people names and organizations, • (e) character-n-gram predicates for 2 ≤ n ≤ 3. • MO: correct entities. • MC: entity boundaries known, but classification not. • MR&C: neither entity boundaries nor classification are known. “The [New Deal]LOC describes the program of US president Franklin [D. Roosevelt]PER”
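The feature types (a) to (e) can be illustrated with a small extractor. This is a sketch in plain Python, not MALLET code; the function name `token_features`, the feature-string format, and the particular orthographic predicates are assumptions, since the slide does not list the exact predicates used.

```python
def token_features(word, pos, gazetteers):
    """Return a set of string features for one token.

    word: the token text; pos: its PoS tag.
    gazetteers: dict mapping a gazetteer name to a set of lowercased entries
    (hypothetical format).
    """
    lowered = word.lower()
    # (a) the word itself and (b) its PoS tag.
    feats = {f"word={lowered}", f"pos={pos}"}
    # (c) orthographic predicates (an assumed, illustrative selection).
    if word[:1].isupper():
        feats.add("orth=init_cap")
    if word.isupper():
        feats.add("orth=all_caps")
    if any(ch.isdigit() for ch in word):
        feats.add("orth=has_digit")
    # (d) gazetteer membership.
    for name, entries in gazetteers.items():
        if lowered in entries:
            feats.add(f"gaz={name}")
    # (e) character n-grams for 2 <= n <= 3.
    for n in (2, 3):
        for i in range(len(lowered) - n + 1):
            feats.add(f"ngram{n}={lowered[i:i + n]}")
    return feats


feats = token_features("Roosevelt", "NNP", {"people": {"roosevelt"}})
```

A sequence labeler such as a CRF would consume one such feature set per token, together with the features of neighboring tokens.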

  6. Methodology • Relation Extraction • Method: SVM. • Kernel methods: • KGC: Global Context Kernel • KLC: Local Context Kernel • KSL: Shallow Linguistic Kernel
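The slide only names the three kernels, so the following is a rough sketch of how two context kernels might be combined, under stated assumptions: KSL is taken to be the sum of normalized KGC and KLC (consistent with the Conclusion's "combined kernel" and "its basic parts", but not stated on this slide), the global context is simplified to a bag of the words between the two entities, and the local context to a one-token window around each entity. All function names are hypothetical.

```python
import math
from collections import Counter


def dot(a, b):
    """Sparse dot product of two Counter feature vectors."""
    return sum(a[k] * b[k] for k in a.keys() & b.keys())


def normalized_kernel(a, b):
    """Cosine-style normalization: K(a, b) / sqrt(K(a, a) * K(b, b))."""
    denom = math.sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def global_context(tokens, i, j):
    """Assumed simplification: bag of words between entity positions i < j."""
    return Counter(tokens[i + 1:j])


def local_context(tokens, i, j, window=1):
    """Assumed simplification: position-tagged words around each entity."""
    feats = Counter()
    for name, pos in (("e1", i), ("e2", j)):
        lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
        for k in range(lo, hi):
            feats[f"{name}:{k - pos}:{tokens[k]}"] += 1
    return feats


def k_sl(x, y):
    """Combined kernel, assumed here to be K_GC + K_LC."""
    (tx, ix, jx), (ty, iy, jy) = x, y
    k_gc = normalized_kernel(global_context(tx, ix, jx), global_context(ty, iy, jy))
    k_lc = normalized_kernel(local_context(tx, ix, jx), local_context(ty, iy, jy))
    return k_gc + k_lc


# Each example is (tokens, index of first entity, index of second entity).
x = ("Roosevelt was born in Hyde_Park".split(), 0, 4)
y = ("Lincoln was born in Kentucky".split(), 0, 4)
```

Because a sum of valid kernels is itself a valid kernel, the Gram matrix built from `k_sl` could be handed to any SVM implementation that accepts precomputed kernels.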

  7. Experiments • Dataset • From the papers of Roth and Yih • Evaluation • Cross-validation: Precision, Recall and F-measure • Statistical significance: approximate randomization. • Confidence interval: percentile bootstrap. • The effectiveness of the kernel method. • The influence of the noise. • Compare this approach against the method proposed in Roth and Yih.
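The three evaluation ingredients named here can be sketched as follows. This is a generic illustration of the standard techniques, not the authors' evaluation scripts; the function names, the number of trials and resamples, and the choice of F-measure as the test statistic are assumptions.

```python
import random


def prf(gold, pred):
    """Precision, recall and F-measure for labels in {1, -1}."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == -1 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == -1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def approximate_randomization(gold, pred_a, pred_b, trials=1000, seed=0):
    """Estimate a p-value for the F-measure difference between two systems."""
    rng = random.Random(seed)
    observed = abs(prf(gold, pred_a)[2] - prf(gold, pred_b)[2])
    at_least = 0
    for _ in range(trials):
        shuf_a, shuf_b = [], []
        for a, b in zip(pred_a, pred_b):
            # Swap the two systems' outputs on this example with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuf_a.append(a)
            shuf_b.append(b)
        diff = abs(prf(gold, shuf_a)[2] - prf(gold, shuf_b)[2])
        if diff >= observed:
            at_least += 1
    return (at_least + 1) / (trials + 1)


def percentile_bootstrap(gold, pred, resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the F-measure."""
    rng = random.Random(seed)
    n = len(gold)
    scores = []
    for _ in range(resamples):
        # Resample examples with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(prf([gold[i] for i in idx], [pred[i] for i in idx])[2])
    scores.sort()
    lo = scores[int((alpha / 2) * resamples)]
    hi = scores[min(resamples - 1, int((1 - alpha / 2) * resamples))]
    return lo, hi


gold = [1, 1, -1, -1]
pred = [1, -1, 1, -1]
scores = prf(gold, pred)
```

Both resampling procedures avoid distributional assumptions about the F-measure, which is one reason they are commonly preferred over parametric tests for this kind of metric.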

  8. Experiments • The effectiveness of the kernel method. • Relation extraction trained and tested on the correct entities. • Testing on MC • Training on the correct entities. • * Training on the MC. • Testing on MR&C • Training on the correct entities. • * Training on the MR&C.

  9. Experiments • The influence of the noise. • Training by the MO • Training by the MR&C

  10. Experiments • Compare this approach against the method proposed in Roth and Yih. • The entities are correctly identified. • The entity boundaries are known.

  11. Conclusion • The method has already demonstrated state-of-the-art performance in extracting protein-protein interactions from biomedical literature. • The experiments show that, applied to the newswire domain, the combined kernel is still consistently superior to its basic parts, mainly in terms of precision, and that it significantly outperforms previously proposed approaches even in the presence of noise introduced by an automatic entity tagger. • Future work: • Evaluate the contribution of syntactic information to relation extraction. • Extend the application of the proposed methodology to a different and wider set of relations. • Explore the possibility of reducing the size of the training set using unsupervised techniques.

  12. Personal Comments • Advantage • … • Drawback • … • Application • Relation extraction • Named-entity recognition
