
Margherita Sini and Asanee Kawtrakul, APAN 2006, Singapore, 20 July 2006



Presentation Transcript


  1. Key step to Ontology and Cross-language KM: AOS/CS workbench • Margherita Sini, Asanee Kawtrakul • APAN 2006, Singapore, 20 July 2006

  2. Outline • Background and Motivation • Design Framework • Current Status • Next Step

  3. Background: Two Requests • Agricultural Information Service • Facts: valuable information sources are scattered; language barriers; digital divide • Need: Information Integration → Knowledge Portal • Organizational Knowledge Management • Facts: information overload, especially from unstructured electronic articles and reports • Need: explicit knowledge collection and sharing • Demand-driven research on • Ontology Construction and Maintenance • Applications in Knowledge Portal

  4. Knowledge Portal and Management • Information Extraction, Knowledge Extraction, Ontology Maintenance • Knowledge and Ontology Engineering • Knowledge Summarization and Tracking: know-who, know-what, know-why • Corpus Analysis and Software Tools • Corpus analysis, word cut, sentence segmentation, EDU segmentation • Language Engineering and Resources • Named entity recognition, parser, frame, thesaurus, lexicon, grammar, treebank

  5. Information Extraction • Avian Influenza (dispersion) Situation • Time: 9 ตุลาคม 2547 (9 October B.E. 2547, i.e. 2004) • Location: อยุธยา (Ayutthaya) • Event: ไก่ล้มตายเป็นจำนวนมาก (large numbers of chickens died) • Reaction: ประกาศเขตควบคุมโรค (disease control zone declared) • Extraction by using resources from Knowledge Acquisition

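  6. Language Engineering • Extracted words: เด็ก (child), ตาย (die), ไข้หวัดนก (bird flu), ระบาด (spread), อยุธยา (Ayutthaya), ไก่ (chicken), ผู้ป่วย (patient), โรค (disease), ป่วย (be ill), มี (have), คน (person)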

  7. Template Connection • Predicate frames: ระบาด (spread): ระบาด(Disease, Patient, Location) • ป่วย (be ill): ป่วย(Patient, Disease) • ตาย (die): ตาย(Patient, Cause) • Sit_Management: Situation, Action • Situation: Event, Location, Time • Announcement … Control … Prevention • Dispersion
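The template connection on this slide can be sketched as a small data structure: predicate frames bind extracted arguments to roles, and the roles then fill the Situation and Sit_Management templates. This is a minimal sketch, not the authors' implementation; the class names, `bind`, and `connect` are hypothetical, and mapping the Location role to the template's Location slot is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Predicate frames as listed on the slide: predicate -> ordered argument roles.
FRAMES = {
    "ระบาด": ("Disease", "Patient", "Location"),  # spread / break out
    "ป่วย": ("Patient", "Disease"),                # be ill
    "ตาย": ("Patient", "Cause"),                   # die
}


@dataclass
class Situation:
    event: str
    location: Optional[str] = None
    time: Optional[str] = None


@dataclass
class SitManagement:
    situation: Situation
    action: Optional[str] = None


def bind(predicate: str, *args: str) -> Dict[str, str]:
    """Bind positional arguments to the roles of a predicate frame."""
    roles = FRAMES[predicate]
    if len(args) != len(roles):
        raise ValueError(f"{predicate} expects {len(roles)} arguments, got {len(args)}")
    return dict(zip(roles, args))


def connect(predicate: str, roles: Dict[str, str],
            time: Optional[str] = None, action: Optional[str] = None) -> SitManagement:
    """Fill the Situation and Sit_Management templates from a bound frame (assumed mapping)."""
    situation = Situation(event=predicate, location=roles.get("Location"), time=time)
    return SitManagement(situation=situation, action=action)


# Values taken from the avian influenza example in the slides.
roles = bind("ระบาด", "ไข้หวัดนก", "ไก่", "อยุธยา")
record = connect("ระบาด", roles, time="9 ตุลาคม 2547", action="ประกาศเขตควบคุมโรค")
print(record)
```

Frames without a Location role, such as ตาย(Patient, Cause), simply leave that slot empty in this sketch.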

  8. Information Extraction • Avian Influenza (dispersion) Situation • Time: 9 ตุลาคม 2547 (9 October B.E. 2547, i.e. 2004) • Location: อยุธยา (Ayutthaya) • Event: ไก่ล้มตายเป็นจำนวนมาก (large numbers of chickens died) • Reaction: ประกาศเขตควบคุมโรค (disease control zone declared)

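  9. Warning needs a specific task-oriented ontology • Plant: ข้าว (rice) • Problem: ขาดแคลนน้ำ (water shortage) • Period: กุมภาพันธ์ (February) • Suggestion: งดทำนาปรังครั้งที่ 2 ปลูกพืชไร่ที่ใช้น้ำน้อย และพืชผักที่มีอายุสั้น (skip the second off-season rice crop; grow field crops that need little water and short-lived vegetables)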

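  10. Knowledge Portal / Information Integration with Discourse Producer Intention • Ontology: สภาพแวดล้อมที่เหมาะสม (suitable environment), พันธุ์ (variety), Season, Price, วิธีการปลูก (planting method), Pest, Production, Plant Variety suggestion, การเตรียมเมล็ดพันธุ์ (seed preparation), Seed providers, ข้อจำกัดของพืช (plant limitations), Disease, Planting Method, วิทยาการหลังการเก็บเกี่ยว (post-harvest technology), Harvesting, การเตรียมดิน (soil preparation), วิธีให้น้ำ (watering method), วิธีให้ปุ๋ย (fertilizing method) • Object list: มันสำปะหลัง (cassava), ถั่วเหลือง (soybean), ถั่วเขียว (mung bean), ถั่วดำ (black bean), ถั่วแดง (red bean), ถั่วพุ่ม (cowpea), ถั่วฮามาต้า (hamata stylo), ข้าวโพดหวาน (sweet corn), ข้าวโพดเลี้ยงสัตว์ (feed corn), ข้าวโพดฝักอ่อน (baby corn), ทานตะวัน (sunflower), ผักกาดขาว (Chinese cabbage), กะหล่ำดอก (cauliflower), ผักคะน้า (Chinese kale), ผักกาดหัว (radish)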

  11. Rice • Rice variety: characteristic, pests/diseases it cannot resist, pests/diseases it resists, area condition, environmental tolerance, growing season, watering and fertilizing, harvest time, average yield • Pest: name, characteristic, pest control • Disease: characteristic/symptom, treatment • Weed: name, characteristic, weed control • Product processing • Material supplier/price • Watering/Fertilizing • Agricultural technology/research • Agricultural news • Weather forecast/warning • Rice market/Distributor • Product price • Harvesting • Cultural practice

  12. System Architecture • Domain Ontologies • Task-Oriented Ontologies • Ontology: Disease www.eto.ku.ac.th • Pest www.doae.go.th/ • Rice variety www.doa.go.th/ • Rice • Agricultural technology/research • Weather forecast/warning • WWW: unstructured, semi-structured, and structured documents • Metadata annotation tools • Document warehouse • Knowledge Structure • Knowledge Portal • Processing • External Information • Intelligent Search Engine • Multilingual Dictionary • MT • KT

  13. Language Engineering • Pragmatic Semantic Analysis • Discourse: EDU analysis, anaphora resolution • Semantic: semantic interpretation • Syntax: parser, chunker • Morphology: word cut, NE recognition

  14. Knowledge Portal and Management • Information Extraction, Knowledge Extraction, Ontology Maintenance • Knowledge and Ontology Engineering • Knowledge Summarization and Tracking: know-who, know-what, know-why • Corpus Analysis and Software Tools • Corpus analysis, word cut, sentence segmentation, EDU segmentation • Language Engineering and Resources • Named entity recognition, parser, frame, thesaurus, lexicon, grammar, treebank

  15. Motivation: • Ontology as knowledge of the world for mutual information exchange ++ • Having an expert create an ontology is expensive, and maintaining it is an endless task, especially for new terms. • Utilize existing resources: dictionaries, thesauri, encyclopedias, ++

  16. Design Framework

  17. How we start • What we want → Unified and Universal Model • User requirements: multipurpose • Ex. bird flu information extraction, knowledge management about Thai rice, health application, tourism application as supply chain • What we have → Time and Cost Reduction • Multiple resources: reuse ++ • What we do → Tools and Workbench with LE and KE • The Agriculture Ontology Service Initiative

  18. Ontological Semantic • Properties relationship: has_Common_Name, has_Scientific_Name • Object relationship: part-of (leaf, stem, hold hand) • Concepts: Plant, Climber, Tree, Shrub, annual, … • Instances: Cananga odorata, Grape, Coccinia grandis, … • Legend: concept, property, instance

  19. Ontological Semantic: Processing with ordering • Crop husbandry • Soil cultivation, Post harvest, Fertilizing, Irrigation • Order labels in the diagram: (1) (3) (4) (2)

  20. Ontological Semantic: Intention of Goals & Planning • Problem Solving • Root Cause Extraction • Best Practice • Correction • Prevention

  21. Problems

  22. Problems in Dictionaries: coverage, inconsistency etc.

  23. Lexicon Growth

  24. What we are doing and Some Results

  25. Ontology Construction • 3 Sources • Raw Text: Technical paper, Published document • Dictionary • Thesaurus

  26. Ontology Learning System • Sources: AGROVOC Thesaurus, Thai Plant Name Dictionary, Raw Text • Thesaurus example: Cereals; BT Plant Product; NT Oats, Rice, Maize; RT Cereal crops • Raw text example: ฟักทอง (pumpkin): ฟักทองเป็นพืชผักที่จัดอยู่ในกลุ่มพืชตระกูลแตง ซึ่งได้แก่ฟักทอง แตงกวา แตงร้าน ฟักแฟง มะระ บวบ แตงโม แคนตาลูป ฯลฯ เป็นพืชผักที่มีราคาถูก มีวิตามินเอสูง ช่วยบำรุงผิวพรรณและถนอมสายตา นำมาทำอาหารได้หลายชนิดเช่นแกงเลียง แกงส้ม เป็นต้น หรือ นำมาทำเป็นอาหารแปรรูปเช่นข้าวเกรียบฟักทอง (Pumpkin is a vegetable in the gourd family group, which includes pumpkin, cucumber, long cucumber, wax gourd, bitter gourd, luffa, watermelon, cantaloupe, etc. It is an inexpensive vegetable, high in vitamin A, that helps nourish the skin and protect eyesight. It can be cooked into many dishes, such as kaeng liang and kaeng som, or made into processed food such as pumpkin crackers.) • Dictionary example (field labels: Genus, Family/Subfamily, Specific epithet, Author Name, Habit, Local Name, Synonym, Is-A): Chirita GESNERIACEAE: fulva Barnett H ดาดหอย Dathoi (Nakhon Si Thammarat). involucrata Craib H น้ำดับไฟ Nam dap fai (Surat Thani); มะและ Malae (Pattani). micromusa B. L. Burtt H คำหยาด Kham yat (Nakhon Ratchasima). Chisocheton MELIACEAE: ceramicus (Miq.) C.DC. T ยมใหญ่ Yom yai (General). cumingianus (C.DC.) Harms subsp. balansae (C.DC.) Mabb. T ยมมะกอก Yom makok (Chiang Mai). • Extracted ontology fragment (diagram labels): Plant Product, พืชผัก (vegetable), พืชตระกูลแตง (gourd family), อาหาร (food), อาหารแปรรูป (processed food), ฟักทอง (pumpkin), แตงกวา (cucumber), ฟักแฟง (wax gourd), แกงเลียง (kaeng liang), ข้าวเกรียบฟักทอง (pumpkin crackers), Cereal crops, Cereals, Oats, Rice, Maize; relation labels: IS-A, Made-of, Production_of, Formal Name • Pipeline (diagram labels): Unstructured Corpus, Structured Corpus, Thesaurus, Dictionary, OCR, Morphological Analysis, Annotation, Structure Analysis, Term Extraction, Grammatical Rules Learning, WordNet, Features of the Dictionary, Define Explicit Rules, Identification of Semantic Relation, Lexico-Syntactic Patterns, Relation Analysis, Semantic Relationship Recycling & Refinement, Correction of Concepts & Relations, WordNet Rules, Heuristic Rules, Organizing System, Verification System, Ontology
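As an illustration of how the thesaurus example on this slide could feed an ontology, the sketch below turns one BT/NT/RT entry into relation triples. It is a naive sketch under stated assumptions: the function name and the entry layout are hypothetical, reading BT/NT as IS-A is a simplification, and RT is kept as an unrefined `RELATED-TO` placeholder because the slide's diagram appears to refine such links into specific relations (for example Production_of) in a later step.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def thesaurus_to_triples(entry: Dict[str, object]) -> List[Triple]:
    """Convert one thesaurus entry into (subject, relation, object) triples."""
    term = entry["term"]
    triples: List[Triple] = []
    for broader in entry.get("BT", []):    # broader term: term IS-A broader
        triples.append((term, "IS-A", broader))
    for narrower in entry.get("NT", []):   # narrower term: narrower IS-A term
        triples.append((narrower, "IS-A", term))
    for related in entry.get("RT", []):    # related term: left for later refinement
        triples.append((term, "RELATED-TO", related))
    return triples


# The AGROVOC example from the slide.
cereals = {
    "term": "Cereals",
    "BT": ["Plant Product"],
    "NT": ["Oats", "Rice", "Maize"],
    "RT": ["Cereal crops"],
}
for triple in thesaurus_to_triples(cereals):
    print(triple)
```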

  27. Organizing System • Use the thesaurus ontology as the core tree • Merge the forest of ontology fragments extracted from the dictionary and the texts into the core ontology by using NLP techniques • Phrasal Analysis • Term Matching • Diagram labels: Plant, Crops, Oil Crops, Oil Palms; Plant Products, Fruit, Watermelons, Tamarind; subtrees joined with "+"

  28. Organizing System • Operations • a) Add (diagram labels: Plant Products, Fruit, Watermelons, Tamarind) • b) Delete (diagram labels: Crops, Oil Crops, Oil Palms) • c) Insert (diagram labels: Fruit, Tropical Fruit, Durian)
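One possible reading of the three operations is sketched below on a simple parent-to-children tree. The slide's diagrams do not survive in the transcript, so the semantics here are assumptions: Add attaches new children to an existing node, Delete removes a direct link that is redundant because a longer path exists, and Insert places an intermediate concept between a parent and a child. The `Ontology` class and its method names are hypothetical.

```python
from collections import defaultdict


class Ontology:
    """Minimal concept tree: parent -> set of direct children."""

    def __init__(self):
        self.children = defaultdict(set)

    def add(self, parent, child):
        """Add: attach child directly under parent."""
        self.children[parent].add(child)

    def delete(self, parent, child):
        """Delete: remove the direct link parent -> child."""
        self.children[parent].discard(child)

    def insert(self, parent, middle, child):
        """Insert: place middle between parent and child."""
        self.delete(parent, child)
        self.add(parent, middle)
        self.add(middle, child)

    def is_ancestor(self, ancestor, node):
        """Return True if node is reachable from ancestor."""
        stack = list(self.children.get(ancestor, ()))
        seen = set()
        while stack:
            current = stack.pop()
            if current == node:
                return True
            if current not in seen:
                seen.add(current)
                stack.extend(self.children.get(current, ()))
        return False


core = Ontology()
core.add("Plant Products", "Fruit")
core.add("Fruit", "Durian")

# a) Add: new leaves extracted from the dictionary or texts.
core.add("Fruit", "Watermelons")
core.add("Fruit", "Tamarind")

# b) Delete: drop a direct link once a more specific path exists.
core.add("Crops", "Oil Palms")
core.add("Crops", "Oil Crops")
core.add("Oil Crops", "Oil Palms")
core.delete("Crops", "Oil Palms")

# c) Insert: a more specific concept between Fruit and Durian.
core.insert("Fruit", "Tropical Fruit", "Durian")
```

Keeping the operations this small makes it easy to log each merge decision for the later verification step.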

  29. More problems

  30. Corpus-based Ontology Construction: Need Language Engineering • Pattern: NP1 ... NP2 ... NP3 ... such as NP, NP, ... • Ex1. Many herbs can be used as medicine and some of them are manufactured in the industry level, such as garlic, ginkgo biloba. • Candidate Terms => herbs, medicine, industry • Ex2. Sun flower is rather enduring with dry season while comparing to other field crops such as corn, soy bean and green bean. • Candidate Terms => Sun flower, field crop • Problems in this process: • Many Candidate Terms
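The "such as" pattern and the candidate-term problem can be illustrated with a short sketch. It assumes the noun phrases have already been identified by an upstream chunker (passed in as a list) and only splits the sentence around the cue phrase; the function name is hypothetical, and choosing the correct hypernym among the candidates is deliberately left open, since that is the problem the slide points out.

```python
import re
from typing import List, Tuple


def such_as_candidates(sentence: str, noun_phrases: List[str]) -> Tuple[List[str], List[str]]:
    """Return (candidate hypernyms, hyponyms) for one 'such as' sentence.

    noun_phrases is assumed to come from an NP chunker; every NP that
    occurs before the cue phrase is kept as a candidate hypernym.
    """
    head, cue, tail = sentence.partition("such as")
    if not cue:
        return [], []
    # Items after the cue are separated by commas or the word "and".
    parts = re.split(r",|\band\b", tail.rstrip(". "))
    hyponyms = [part.strip() for part in parts if part.strip()]
    candidates = [np for np in noun_phrases if np in head]
    return candidates, hyponyms


ex1 = ("Many herbs can be used as medicine and some of them are manufactured "
       "in the industry level, such as garlic, ginkgo biloba.")
ex2 = ("Sun flower is rather enduring with dry season while comparing to other "
       "field crops such as corn, soy bean and green bean.")

print(such_as_candidates(ex1, ["herbs", "medicine", "industry"]))
print(such_as_candidates(ex2, ["Sun flower", "field crops"]))
```

In Ex1 all three noun phrases survive as candidates even though only "herbs" is the intended hypernym, which is why a further selection step is needed.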

  31. Agricultural NE recognition • Plant name dictionary • Gathered from “ชื่อพรรณไม้แห่งประเทศไทย” (Thai Plant Names) by “เต็ม สมิตินันท์” (Tem Smitinand) and names that frequently occur in the corpus • Size: 17,610 names • Animal name dictionary • Gathered from the internet and names that frequently occur in the corpus • 3,374 names • Pathogen name dictionary • Gathered from names that frequently occur in the corpus • 8 names • Disease name dictionary • Gathered from the internet and names that frequently occur in the corpus • 237 names • Chemical name dictionary • Gathered from the internet and names that frequently occur in the corpus • 224 names
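Dictionary-based NE recognition of this kind is often done by longest-match lookup, which also suits Thai text because it does not depend on spaces between words. The sketch below is an assumption about how such dictionaries could be applied, not the authors' system; `tag_entities`, the label names, and the tiny sample dictionaries are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tag_entities(text: str, gazetteers: Dict[str, Set[str]]) -> List[Tuple[str, str, int]]:
    """Greedy longest-match lookup; returns (name, label, start offset)."""
    lookup = {name: label for label, names in gazetteers.items() for name in names}
    max_len = max(map(len, lookup), default=0)
    found = []
    i = 0
    while i < len(text):
        # Try the longest possible span first so longer names win.
        for length in range(min(max_len, len(text) - i), 0, -1):
            span = text[i:i + length]
            if span in lookup:
                found.append((span, lookup[span], i))
                i += length
                break
        else:
            i += 1  # no dictionary name starts here
    return found


# Tiny stand-ins for the plant and disease dictionaries.
gazetteers = {
    "PLANT": {"ฟักทอง", "แตงกวา", "corn", "sweet corn"},
    "DISEASE": {"ไข้หวัดนก"},
}
print(tag_entities("ฟักทองและแตงกวา", gazetteers))
print(tag_entities("sweet corn", gazetteers))
```

Because matching is character-based, a name can also fire inside a longer unrelated word, so a real system would combine the lookup with word segmentation or context rules.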

  32. System Overview • AGROVOC • Verification Module: verification using expert-defined rules; verification using training statistics-based rules • Rules Acquisition Module: Define Explicit Rules (Noun Phrase Analysis, NP Rules; WordNet Alignment, WordNet Rules); Annotation, Examples, Learning • Detection and Suggestion Module

  33. Next Step

  34. Benefits

  35. Intelligent Search Engine: K-Services • Know-who (tracking for help) • Know-what (structural knowledge, patterns) • Know-why (deeper knowing) • Know-how (skill, procedure) • Know-when (timing) • Know-where (place, context and tracking) Adapted from Skyrme, D. (1999) Knowledge networking: creating the collaborative enterprise. Butterworth-Heinemann, Oxford, p. 46.

  36. User Interface • Symbolic Property • Numeric Property • List of all input properties • Ranked result • Green stink bug 50% • Corn aphids 50% • Hexagon spider 50% • Long-legged spider 5%

  37. Question & Answer • User's question: What causes the rice leaf to be yellow and dry? • Keyword: yellow and dry leaf • Answer: If Brown Leap Hopper damages the rice field, the leaves will turn yellow and dry.

  38. Acknowledgement • NECTEC • KURDI • FAO
