CS652 Spring 2004 Summary

Presentation Transcript

  1. CS652 Spring 2004 Summary

  2. Course Objectives • Learn how to extract, structure, and integrate Web information • Learn what the Semantic Web is • Learn how to build ontologies for the Semantic Web • Investigate class-related research topics • Be introduced to Semantic Web services

  3. Generally Applicable Ideas • Semantic Understanding • Data: attribute-value pairs • Information: data in a conceptual model • Knowledge: information with agreement • Meaning: useful knowledge • Measuring Success • Recall: NrCorrect/TotalCorrect • Precision: NrCorrect/(NrCorrect+NrIncorrect) • F-measure: (β²+1)PR/(β²P+R)
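
The three success measures on this slide can be sketched in a few lines of Python. The function names and the counts in the usage lines are hypothetical, chosen only to illustrate the formulas; P and R in the F-measure are taken to be precision and recall.

```python
def recall(nr_correct: int, total_correct: int) -> float:
    """Recall = NrCorrect / TotalCorrect (fraction of the answer key that was found)."""
    return nr_correct / total_correct if total_correct else 0.0


def precision(nr_correct: int, nr_incorrect: int) -> float:
    """Precision = NrCorrect / (NrCorrect + NrIncorrect)."""
    extracted = nr_correct + nr_incorrect
    return nr_correct / extracted if extracted else 0.0


def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 weights P and R equally."""
    denom = beta ** 2 * p + r
    return (beta ** 2 + 1) * p * r / denom if denom else 0.0


# Hypothetical run: 8 correct and 2 incorrect extractions, 10 items in the answer key.
p = precision(8, 2)
r = recall(8, 10)
print(p, r, f_measure(p, r))
```

With β = 1 the F-measure reduces to the harmonic mean of precision and recall; in this formula a β above 1 weights recall more heavily and a β below 1 weights precision more heavily.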

  4. Information Extraction • Get relevant information • Not: • Information retrieval: get relevant pages • Web mining: discover unknown associations • Wrapper: maps data to a suitable format • Generation techniques • Machine learning (e.g. RAPIER) • Natural language processing (e.g. RAPIER) • Hidden Markov Models • By-example generation tools (e.g. Lixto) • By-pattern generation (e.g. RoadRunner) • Wrapper Maintenance

  5. Information Extraction – BYU Ontos • Ontology-based • Data frames • Strengths • Resilient to page changes • Robust across sites within the same domain • Works well with all types of data-rich text • Weaknesses • Hand-crafted ontologies and data frames • Requires record-boundary recognition • Does not learn • Applications • Extraction • High-precision classification • Schema mapping • Semantic Web annotation • Agent communication • Ontology generation

  6. Semantic Web • Tim Berners-Lee • “information [has a] well-defined meaning” • “[enables] computers and people to work in cooperation” • Adds context and structure via metadata • Agent computing paradigm • Knowledge markup; semantic annotation

  7. Ontologies • “a formal, explicit specification of a shared conceptualization” [Gruber93] • Formal: machine readable; FOL • Explicit: concepts and constraints explicitly defined • Shared: community accepted • Conceptualization: abstract model (OSM) • “shared vocabulary”

  8. Ontology Formalism
     Ontology O = <V, A> where
       V = vocabulary = predicate symbols (each with some arity)
       A = axioms = formulas (constraints and rules)
     Predicates: Owner(x), Vehicle(x), Car(x), Truck(x), Owner(x) owns Vehicle(y)
     Formulas:
       ∀x(Car(x) ∨ Truck(x) ⇒ Vehicle(x))
       ∀x(Owner(x) ⇒ ∃≥1 y(Owner(x) owns Vehicle(y)))
     Inference Rules:
       TruckOwner(x) :- Owner(x), Owner(x) owns Vehicle(y), Truck(y)
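
As a rough illustration of the formalism on this slide, the predicates can be modelled as Python sets and the axioms and inference rule as functions over them. This is a minimal sketch, not how BYU Ontos or any reasoner works: the individuals (alice, bob, c1, t1) are made up, and the second formula is read as "every owner owns at least one vehicle", which is an interpretation.

```python
def check_axioms(owner, car, truck, vehicle, owns) -> bool:
    """Check the two formulas against a set of facts (hypothetical encoding)."""
    # Every car or truck is a vehicle.
    subclass_ok = (car | truck) <= vehicle
    # Every owner owns at least one vehicle (assumed reading of the second formula).
    participation_ok = all(
        any(o == x and v in vehicle for (o, v) in owns) for x in owner
    )
    return subclass_ok and participation_ok


def truck_owners(owner, truck, vehicle, owns) -> set:
    """TruckOwner(x) :- Owner(x), Owner(x) owns Vehicle(y), Truck(y)."""
    return {x for (x, y) in owns if x in owner and y in vehicle and y in truck}


# Hypothetical facts, for illustration only.
owner = {"alice", "bob"}
car = {"c1"}
truck = {"t1"}
vehicle = {"c1", "t1"}
owns = {("alice", "t1"), ("bob", "c1")}

print(check_axioms(owner, car, truck, vehicle, owns))
print(truck_owners(owner, truck, vehicle, owns))
```

Here the axioms act as integrity checks on the facts, while the inference rule derives a new unary predicate that is not stored explicitly.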

  9. Semantic Web Ontologies • RDF • DAML+OIL • OWL

  10. Semantic Web Annotation with BYU Ontos • BYU Ontos Extraction Ontology • OWL Ontology: osm.cs.byu.edu/CS652s04/ontologies/OWL/carads.owl • Annotated Semantic Web Page: osm.cs.byu.edu/CS652s04/ontologies/annotatedPages/carSrch1_semweb.html

  11. Ontology Generation for the Semantic Web • Necessary for the Semantic Web • Ontology engineering • Tools • Methodology • Languages (e.g. SHOE, OWL) • Semiautomatic generation • NLP + machine learning (e.g. OntoText) • Create from dictionary or lexicon (e.g. Doddle) • Generation from tables (e.g. TANGO) • Ontology maintenance

  12. Ontology Libraries for the Semantic Web • Locating ontologies • Indexing and organization • Search mechanisms • Reusing ontologies • Find one and modify • Find several, merge and modify

  13. Ontology Mapping, Merging, and Integration for the Semantic Web • Ontology reuse • Heterogeneous agent communication • Agent commitment to a new ontology • On the fly: map, merge, integrate (nontrivial to automate) • Can we do well enough? • Can we synergistically involve a user? • Information extraction with respect to a target • Table extraction (BYU Ontos) • Semiautomatic wrapper/mediator construction by automatically providing mappings

  14. Schema Mapping • Schema-level matchers • Name matchers (dictionaries – WordNet) • Structural context matchers • Instance-level matchers • Value characteristics • Data-frame matchers • Mapping cardinality • 1:1 (direct) • 1:n, n:1, n:m (indirect, complex) • Multi-faceted mapping techniques
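
A schema-level name matcher of the kind listed on this slide might look roughly like the sketch below. It is an illustration under stated assumptions: the tiny `SYNONYMS` table is a hypothetical stand-in for a dictionary such as WordNet, the attribute names and the 0.8 threshold are invented, and only direct 1:1 mappings are produced. String similarity comes from `difflib.SequenceMatcher` in the Python standard library.

```python
from difflib import SequenceMatcher

# Hypothetical synonym pairs standing in for a real dictionary lookup.
SYNONYMS = {frozenset({"car", "auto"}), frozenset({"price", "cost"})}


def name_score(a: str, b: str) -> float:
    """Score two attribute names: 1.0 for equal or synonymous names, else string similarity."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def match_schemas(source, target, threshold=0.8):
    """Map each source attribute to its best-scoring target attribute (1:1, direct)."""
    mapping = {}
    for s in source:
        best = max(target, key=lambda t: name_score(s, t))
        if name_score(s, best) >= threshold:
            mapping[s] = best
    return mapping


# Hypothetical source and target schemas.
source = ["Auto", "Cost", "Mileage"]
target = ["car", "price", "miles", "color"]
print(match_schemas(source, target))
```

Names that fall below the threshold are left unmapped, which is where the slide's other techniques (structural context, instance-level value characteristics, data-frame matchers) would have to contribute evidence.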

  15. Schema Integration • FCA merge using lattices • Global as View (GAV) • Global mediator relations are views over source relations • Dynamic mediator schema – changes to accommodate new sources (hard to add new sources) • Query only requires view unfolding • Good for static, centralized systems • TSIMMIS • Local as View (LAV) • Local source relations are views over mediator relations • Fixed mediator schema – new sources identify components covered (easy to add new sources) • Complex query rewriting • Good for dynamic, distributed systems • Information Manifold
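
The Global-as-View idea, where a mediator relation is defined as a view over source relations and a query is answered by unfolding that view, can be sketched as follows. The two source relations, their attribute names, and the mediator relation Car(make, price) are all hypothetical; this is not how TSIMMIS is implemented.

```python
# Hypothetical source relations with differing attribute names.
source_a = [{"make": "Ford", "price": 5000}, {"make": "Toyota", "price": 9000}]
source_b = [{"brand": "Honda", "cost": 7000}]


def car_view():
    """GAV definition: mediator relation Car(make, price) as a view over the sources."""
    for row in source_a:
        yield {"make": row["make"], "price": row["price"]}
    for row in source_b:
        yield {"make": row["brand"], "price": row["cost"]}


def cars_cheaper_than(limit):
    """A mediator query; answering it only requires unfolding car_view()."""
    return [car for car in car_view() if car["price"] < limit]


print(cars_cheaper_than(8000))
```

Adding a third source here means editing `car_view()` itself, which mirrors the slide's point that GAV makes new sources hard to add. Under LAV each source would instead be described as a view over Car, and the mediator would have to rewrite queries in terms of those descriptions, which this sketch does not attempt.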

  16. What is your dream for the Semantic Web? • Intelligent personal agents that can: • Gather (just) the information we want and deliver it to us when we want it • Help us with scheduling • Help us buy the goods we want • Negotiate and conduct business for us • … • Intelligent business agents • Intelligent discovery agents • … What can you do to make your dreams come true?