
Haystack

Haystack. Dennis Quan Oxygen Workshop, January, 2002. Introduction. Personalized information store Semistructured data with arbitrary metadata Unified ontology Standards-based components and infrastructure Compatible with existing systems Example user interface



  1. Haystack Dennis Quan Oxygen Workshop, January, 2002

  2. Introduction • Personalized information store • Semistructured data with arbitrary metadata • Unified ontology • Standards-based components and infrastructure • Compatible with existing systems • Example user interface • Integration with mail and groupware concepts • Collaboration possibilities

  3. What is an Ontology? • “The branch of metaphysics that deals with the nature of being.” – American Heritage Dictionary • Describes relationships between different objects in a system • Like schemata or class hierarchies

  4. Resource Description Framework (RDF) • Standard defined by W3C in 1999 (http://www.w3.org/RDF/) • Models statements of the form: <subject> <predicate> <object> • Can be expressed as a labeled, directed graph • For example, the statements “Bob likes Alice” and “Bob likes Jane”:
     [Figure: directed graph with edges Bob -likes-> Alice and Bob -likes-> Jane]
     <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
       <rdf:Description rdf:about="Bob">
         <likes rdf:resource="Alice" />
         <likes rdf:resource="Jane" />
       </rdf:Description>
     </rdf:RDF>
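The subject/predicate/object model on this slide can be made concrete by pulling triples out of the example document. The following is a minimal sketch, not a general RDF/XML parser: it assumes the simple `rdf:Description` / `rdf:resource` shape shown on the slide, and the function name `extract_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example document from the slide, with straight quotes.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="Bob">
    <likes rdf:resource="Alice" />
    <likes rdf:resource="Jane" />
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples from the simple shape above.

    Assumption: every child of rdf:Description is a property element whose
    value is given by an rdf:resource attribute.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, prop.tag, obj))
    return triples


triples = extract_triples(RDF_XML)
```

Each tuple corresponds to one labeled edge of the directed graph: the subject and object are nodes and the predicate is the edge label.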

  5. RDF Store • RDF Store used by Haystack to store all information • Runs off of a standard SQL database • Provides querying facility • Example: who likes Jane? (?x likes Jane); return ?x
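One way to picture an RDF store that runs on a SQL database is a single three-column table of statements. The sketch below uses Python's built-in `sqlite3` module as a stand-in; the table layout, the column names, and the helper `subjects_where` are assumptions for illustration and are not taken from Haystack itself.

```python
import sqlite3

# In-memory database standing in for "a standard SQL database".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [("Bob", "likes", "Alice"), ("Bob", "likes", "Jane")],
)


def subjects_where(predicate, obj):
    """Answer a query of the form (?x <predicate> <obj>); return ?x."""
    rows = conn.execute(
        "SELECT subject FROM triples WHERE predicate = ? AND obj = ? "
        "ORDER BY subject",
        (predicate, obj),
    )
    return [row[0] for row in rows]


# "Who likes Jane?" corresponds to (?x likes Jane); return ?x
who_likes_jane = subjects_where("likes", "Jane")
```

The variable `?x` in the slide's query becomes the selected column, and the fixed terms become the `WHERE` conditions.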

  6. Belief • With multitude of information, how much is believable? • Annotate who said what • Also can describe belief network using RDF • Example: John says that Bob likes Jane, and Bob believes John • Belief Server: component of Haystack that evaluates belief network and “filters” the store for information believed by the user [Figure: graph in which the statement “Bob likes Jane” is assertedBy John, and Bob believes John]
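The filtering idea can be sketched as follows: each statement records who asserted it, and a user sees only statements asserted by sources that user believes. This is an illustrative sketch, not the Belief Server's actual algorithm. It assumes belief is followed transitively and that users believe themselves, and the second statement (asserted by a hypothetical "Mallory") is invented purely to show something being filtered out.

```python
# Statements annotated with their asserter ("who said what").
statements = [
    {"triple": ("Bob", "likes", "Jane"), "asserted_by": "John"},
    {"triple": ("Bob", "likes", "Alice"), "asserted_by": "Mallory"},  # hypothetical
]

# Belief network: Bob believes John.
believes = {"Bob": {"John"}}


def trusted_sources(user, believes):
    """Everyone reachable from `user` along `believes` edges, plus the user.

    Assumption: belief is transitive. A real system might weight or limit this.
    """
    seen = {user}
    stack = [user]
    while stack:
        person = stack.pop()
        for other in believes.get(person, ()):
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen


def believed_by(user, statements, believes):
    """Filter the store down to triples asserted by sources the user trusts."""
    trusted = trusted_sources(user, believes)
    return [s["triple"] for s in statements if s["asserted_by"] in trusted]


bobs_view = believed_by("Bob", statements, believes)
```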

  7. Collections • Basic means of aggregation • Difference from “folders”: containment versus membership • Categorization and subcategories
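The difference between containment and membership can be illustrated with sets: a folder holds an item in exactly one place, whereas an object can be a member of any number of collections at once. The collection names and the URI below are hypothetical.

```python
# Each collection is a set of object identifiers (membership, not containment).
collections = {"Work": set(), "Urgent": set()}

report = "urn:example:report"  # hypothetical identifier

# The same object can belong to several collections simultaneously.
collections["Work"].add(report)
collections["Urgent"].add(report)


def collections_of(uri):
    """List every collection the object is a member of."""
    return sorted(name for name, members in collections.items() if uri in members)


memberships_before = collections_of(report)

# Removing the object from one collection does not affect the others.
collections["Urgent"].discard(report)
memberships_after = collections_of(report)
```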

  8. Queries • One possible means for constructing a collection (result set) • Can use all possible metadata fields to construct query • Natural language • Multiple query sources—the Web, other people’s Haystacks, etc. • Automatic update of query result sets • Possibilities for machine learning (e.g., when a user removes an item from a result set—a message to Haystack that an object does not belong)

  9. Services • Callable services in Haystack • Also, automatic agents that respond to events • Available methods described in metadata • Haystack service initialization script also described in metadata • Services mainly written in Java, but can be written in any language

  10. SOAP, WSDL and UDDI • Relationship to Web Services standards: • Simple Object Access Protocol (SOAP) http://www.w3.org/TR/SOAP/ • Web Services Description Language (WSDL) http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/wsdl.asp • Universal Description, Discovery and Integration (UDDI) http://www.uddi.org/ • SOAP and HTTP/PUT used as protocols for communication between services, including the RDF Store • RDFized version of WSDL used to describe services’ interfaces • UDDI query functionality easily modeled in RDF query

  11. Inference Layer • The semantics defined in RDF often permit deduction • Example: Fido is a dog and dogs are mammals ⇒ Fido is a mammal • Deduced knowledge is useful and should be stored • Inference Layer recognizes patterns and triggers agents/services to perform deduction
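The Fido example amounts to a simple forward-chaining rule: if x has type C and C is a subclass of D, then x has type D. The sketch below applies that one rule until nothing new is deduced and keeps the deduced triples alongside the originals. The predicate names `type` and `subClassOf` are illustrative assumptions modeled on RDF vocabulary, and the loop is a naive stand-in for whatever pattern matching the Inference Layer actually performs.

```python
def infer_types(triples):
    """Add (x, type, D) whenever (x, type, C) and (C, subClassOf, D) are known."""
    known = set(triples)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for subject, predicate, cls in list(known):
            if predicate != "type":
                continue
            for child, relation, parent in list(known):
                if relation == "subClassOf" and child == cls:
                    deduced = (subject, "type", parent)
                    if deduced not in known:
                        known.add(deduced)
                        changed = True
    return known


facts = [
    ("Fido", "type", "Dog"),
    ("Dog", "subClassOf", "Mammal"),
]
store = infer_types(facts)
```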

  12. Views • May be several different ways of looking at an object • Example: appointment book can be viewed as a sortable list of appointments or a calendar • Views are a distinct type of object used to model these different ways of looking at objects
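As a rough illustration of the appointment book example, the same underlying data can be presented through two different view functions, one producing a sortable list and one grouping by day as a calendar would. The sample appointments and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date

# One underlying object: a list of appointments (sample data).
appointments = [
    {"title": "Dentist", "day": date(2002, 1, 15)},
    {"title": "Workshop", "day": date(2002, 1, 10)},
    {"title": "Lunch", "day": date(2002, 1, 15)},
]


def list_view(items, key="day"):
    """View 1: a list of titles, sortable by any field."""
    return [a["title"] for a in sorted(items, key=lambda a: a[key])]


def calendar_view(items):
    """View 2: titles grouped by day, as a calendar would show them."""
    by_day = defaultdict(list)
    for a in items:
        by_day[a["day"]].append(a["title"])
    return dict(by_day)


by_date = list_view(appointments)
by_title = list_view(appointments, key="title")
calendar = calendar_view(appointments)
```

Neither function changes the appointments themselves, which is the point of modeling views as objects separate from the data they display.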

  13. User Interface Ontology • UI components (e.g. JavaBeans, ActiveX controls) rich sources of metadata • Form descriptions also describable with metadata • Possible to construct a directed graph that models a user interface • Similar in concept to XUL • Permits dynamic deduction of user interface similar to XSLT, except semantic rather than syntactic • Part: a Haystack UI component • ViewPart: a kind of part specially designed to display a specific kind of View

  14. SWT • Cross-platform Java widget toolkit • Part of Eclipse project (http://www.eclipse.org/) • Uses native operating systems’ widgets, avoiding performance problems • Used for Part framework • Integrates with Mozilla web browser • Also possible to use ActiveX controls and GTK widgets

  15. Ozone • Haystack experimental user interface • Modeled after a web browser • Uses parts to describe user interface

  16. Browse/Query Paradigm • Browsing: going through nested folders/categories to locate sought item(s) • Query: giving an explicit set of conditions to locate sought item(s) • Ozone adopts hybrid Browse/Query paradigm • Traditional subcategories still present in Collection view • Also, parameterized categories similar to queries • Previously issued queries persist as subcategories

  17. Mail • E-mail a good source of metadata-rich documents • Messages, e-mail addresses, people and groups can be modeled in RDF • Haystack agents can be used to filter e-mail to make it more manageable • Many e-mail management techniques applicable to documents in general and vice versa

  18. Storage Model • Objects in Haystack named by Uniform Resource Identifiers (URIs) • URLs are a subclass of URIs • Documents and web pages can be named by URLs • HTTP/FTP/WebDAV servers can then be used to store documents • Inefficient to store terabytes of “data” in RDF when existing storage solutions are effective
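A small sketch of the naming distinction: every object has a URI, but only some URIs are URLs that point at a server holding the content. The scheme list and the example identifiers below are assumptions for illustration; WebDAV is covered by the `http` scheme because it runs over HTTP.

```python
from urllib.parse import urlparse

# Schemes assumed here to denote content held on an external server.
FETCHABLE_SCHEMES = {"http", "ftp"}


def is_externally_stored(uri):
    """True if the URI is a URL whose content can live outside the RDF store."""
    return urlparse(uri).scheme in FETCHABLE_SCHEMES


document = "http://example.org/reports/q4.pdf"  # hypothetical URL
person = "urn:example:person:bob"               # hypothetical non-URL URI

document_is_external = is_externally_stored(document)
person_is_external = is_externally_stored(person)
```

Under this split, the store would keep only metadata about the document, while the bytes stay on the existing server.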

  19. Collaboration • Allow Haystack-Haystack and Haystack-Semantic Web information exchange • Filtration of imported data • Who’s the expert? problem • Privacy concerns • Different ways of organizing information between different parties • Can be used to model mailing lists, newsgroups, and groupware

  20. Ontological Conversion • Unlikely that everyone will agree on the same schemata • Ontological conversion converts from one schema to another • Can be implemented as Haystack agents that respond to metadata with “foreign” schemata
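In its simplest form, converting between schemata can be pictured as rewriting predicates according to a mapping table. The sketch below handles only one-to-one predicate renaming; real conversions may need to restructure data. All schema prefixes and predicate names are hypothetical.

```python
# Hypothetical mapping from a "foreign" schema to the local one.
FOREIGN_TO_LOCAL = {
    "foreign:author": "local:creator",
    "foreign:heading": "local:title",
}


def convert(triples, mapping):
    """Rewrite known foreign predicates; leave everything else untouched."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


incoming = [
    ("urn:example:doc1", "foreign:author", "Alice"),
    ("urn:example:doc1", "foreign:heading", "Quarterly Report"),
    ("urn:example:doc1", "local:size", "12"),
]
converted = convert(incoming, FOREIGN_TO_LOCAL)
```

An agent of the kind the slide describes could run such a function whenever metadata with a recognized foreign predicate arrives.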

  21. Implementation • Written for Java 2 platform (JDK 1.3.1) • SWT (Eclipse) used for user interface components • Mozilla web browser • HSQL open source SQL database written in Java • Lucene (Apache Jakarta project) search engine written in Java • Tomcat (Apache Jakarta project) web server written in Java • Parts written in Jython, Java-based Python interpreter
