
Planning to Gather Information

Planning to Gather Information. Chung T. Kwok and Daniel S. Weld, Department of Computer Science & Engineering, University of Washington. Presented by Hemanth Krishnappa.


Presentation Transcript


  1. Planning to Gather Information Chung T. Kwok and Daniel S. Weld Department of Computer Science & Engineering University of Washington Presented by Hemanth Krishnappa

  2. Abstract • It is all about Occam • Occam is a query planning algorithm that determines the best way to integrate data from different sources • Input to Occam – a library of site descriptions and a user query • Output of Occam – one or more plans that encode alternative ways to gather the requested information

  3. Introduction • The Internet and the World Wide Web give access to almost any kind of data or information. • Nearly any type of information is available somewhere, but most users can’t find it. Even expert users waste a lot of time and effort searching for the appropriate information. • Many researchers have worked on this problem. • Occam is one of the algorithms designed for this purpose. • Occam automates the process of locating relevant information sources and combining their data appropriately to answer the user’s query or request

  4. Example • Find out the names of all the people in an office • If this information is in a database, gathering it is very easy. Suppose no such database exists and we have only two information sources: • finger: returns the names of people given their e-mail addresses • userid-room: returns the e-mail addresses of all the occupants of an office • We can answer the query by first executing the userid-room command and then executing the finger command on each e-mail address it returns. • The Occam planner reasons about the capabilities of these information sources, finger and userid-room, and generates multiple plans in order to gather as much information as possible.
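
The two-step plan on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionaries stand in for the real finger and userid-room services, and the e-mail addresses are made up (the names and office number are borrowed from slide 9).

```python
# Hypothetical in-memory stand-ins for the two information sources.
# The e-mail addresses are invented for illustration.
USERID_ROOM = {"429": ["joe@cs", "jack@cs"]}
FINGER = {
    "joe@cs": ("Joe", "Researcher"),
    "jack@cs": ("Jack", "Goodhacker"),
}


def userid_room(office):
    """Return the e-mail addresses of all occupants of an office."""
    return USERID_ROOM.get(office, [])


def finger(email):
    """Return the (first, last) name for an e-mail address, or None."""
    return FINGER.get(email)


def names_in_office(office):
    """Chain the sources: userid-room first, then finger on each address."""
    names = []
    for email in userid_room(office):
        person = finger(email)
        if person is not None:  # a source may not know every address
            names.append(person)
    return names


print(names_in_office("429"))
```

Neither source alone can answer the query; the answer only appears once the output of one source is fed to the other as input.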

  5. Context • Occam uses an action language to represent information sources • Occam uses knowledge preconditions as operators • Occam is different from other AI planners because it models the information state and not the world state. Information state is a description of information collected by Occam at a particular stage in planning. • Occam is more expressive than multidatabase systems as it can model incomplete information in sources • Occam can extract information from both legacy systems and relational database systems.

  6. Representing Sites and Queries • Occam allows the user to interact with Internet services through a single, unified, relational database schema called the world model. • Example: • Email(F, L, E): a relation schema whose attributes are • F – first name, L – last name and E – e-mail address. • Office(F, L, O) • F – first name, L – last name and O – office

  7. Information-Producing Sites • To connect Internet services and data sources to Occam, each source must be modeled: how it can be queried, and how its output maps to relations in the world model. Both are captured by operators, which have two parts • Head – a predicate symbol denoting the name of the operator and an ordered list of variables called arguments. • Body – a conjunction of atomic formulae whose predicate symbols denote relations in the world model. • op(X1, ..., Xn) => r1(..., Xi, ...) /\ ... /\ rm(..., Xj, ...) • finger(F, L, $E, O, Ph) => email(F, L, E) /\ office(F, L, O) /\ phone(O, Ph)

  8. If E is bound to “sam@cs”, then the following tuples would be returned: (“Sam”, “Smith”, “sam@cs”, “501”, “542-8907”) (“Sam”, “Smith”, “sam@cs”, “501”, “542-8908”) These tuples reflect that office(F, L, O) appears in the body of finger and that office “501” has at least two phones. One can never be sure of retrieving all the tuples with a single operator, so multiple operators must be executed to retrieve as many tuples as possible.
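
A rough way to picture the finger operator and its binding pattern is shown below. It assumes, purely for illustration, that the world-model relations are small in-memory sets holding the example data from this slide; a real source would return the tuples itself and would not expose the underlying relations.

```python
# Toy world-model relations holding the example data from the slide.
EMAIL = {("Sam", "Smith", "sam@cs")}
OFFICE = {("Sam", "Smith", "501")}
PHONE = {("501", "542-8907"), ("501", "542-8908")}


def finger(e):
    """Sketch of finger(F, L, $E, O, Ph): the argument E must be bound."""
    if e is None:
        raise ValueError("binding pattern violated: E must be bound")
    rows = []
    # Join email(F, L, E), office(F, L, O) and phone(O, Ph) on shared variables.
    for (f, l, e2) in EMAIL:
        if e2 != e:
            continue
        for (f2, l2, o) in OFFICE:
            if (f2, l2) != (f, l):
                continue
            for (o2, ph) in PHONE:
                if o2 == o:
                    rows.append((f, l, e, o, ph))
    return sorted(rows)


for row in finger("sam@cs"):
    print(row)
```

Calling `finger(None)` raises an error, which mirrors the idea that an operator cannot be executed until its `$`-marked arguments have values.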

  9. Information Gathering Queries Queries are similar to operators: they have a head and a body, but the direction of implication is reversed. Query-for-first-names($O, F) <= office(F, L, O) The above query has two arguments, O and F, and O must be bound. If Joe Researcher and Jack Goodhacker are the occupants of office “429”, then the output of the query would be two tuples: (“429”, “Joe”) (“429”, “Jack”)
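
As a sketch of what this query asks for, the snippet below evaluates it directly against a toy office relation. The relation contents are assumptions assembled from the examples in these slides; Occam itself would have to gather the tuples through operators rather than read the relation directly.

```python
# Toy office(F, L, O) relation built from the names used in the slides.
OFFICE = [
    ("Joe", "Researcher", "429"),
    ("Jack", "Goodhacker", "429"),
    ("Sam", "Smith", "501"),
]


def query_for_first_names(o):
    """Query-for-first-names($O, F) <= office(F, L, O), with O bound to `o`."""
    return [(o, f) for (f, _l, office) in OFFICE if office == o]


print(query_for_first_names("429"))
```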

  10. Plans and Solutions • A plan has the same representation as an operator, except that its body is a conjunction of operator instances. There are two ways of viewing the body of a plan • – one is as a logical conjunction, where the order is unimportant • – the other is the procedural view, where the order is very important • We say a plan p(X1, ..., Xn) => O1 /\ ... /\ Ok is a solution to the query • q(Y1, ..., Yn) <= r1(..., Yi, ...) /\ ... /\ rm(..., Yj, ...) if it satisfies two criteria • The binding patterns of the operator instances are satisfied • The following implication holds • for all c1, ..., cn: • p(c1, ..., cn) => q(c1, ..., cn) • where each ci is a constant.

  11. For example: • Query-for-first-names($O, F) <= office(F, L, O) • if the office number is “429” • returns (“429”, “Joe”) • (“429”, “Jack”) • The binding pattern is satisfied, as “429” binds to O • Every tuple returned by the plan satisfies the query.
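
The first solution criterion (binding patterns are satisfied) can be illustrated with a small check over the procedural view of a plan. This sketch makes several assumptions: a plan is a list of `(name, args, required)` triples, the signature `userid-room($O, E)` is hypothetical, and the second criterion (the implication p => q) is not checked here.

```python
def bindings_satisfied(plan, initially_bound):
    """Return True if every operator's required arguments are bound
    by the time it runs, reading the plan left to right."""
    known = set(initially_bound)
    for _name, args, required in plan:
        if not set(required) <= known:
            return False
        # After an operator executes, all of its arguments have values.
        known |= set(args)
    return True


# Hypothetical signatures: userid-room($O, E) and finger(F, L, $E, O, Ph).
plan = [
    ("userid-room", ("O", "E"), ("O",)),
    ("finger", ("F", "L", "E", "O", "Ph"), ("E",)),
]

print(bindings_satisfied(plan, {"O"}))                  # userid-room, then finger
print(bindings_satisfied(list(reversed(plan)), {"O"}))  # finger before E is known
```

The reversed plan fails because finger needs E before any operator has produced it, which is why order matters in the procedural view.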

  12. Planning to Gather Information

  13. Fringe – contains the operator sequences still to be explored. Sol – contains all the solutions (plans) discovered so far. At each stage, a sequence of operators is removed from the fringe and expanded by the function Instantiate(Op, B). The new sequences are then added to the fringe. The function FindSolutions determines whether any of the sequences in the fringe can be elaborated into a solution plan. Newly discovered solutions are added to Sol, but in any case every sequence is kept on the fringe too.
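
The overall loop described on this slide might look roughly like the sketch below. It is not Occam's actual algorithm: Instantiate(Op, B) is replaced by simply appending an operator name, `find_solutions` is a caller-supplied stand-in, and the `max_len` bound is an assumption added so that the toy search terminates.

```python
from collections import deque


def plan_search(operators, find_solutions, max_len):
    """Breadth-first search over operator sequences (illustrative only)."""
    fringe = deque([()])  # start from the empty sequence
    sol = []
    while fringe:
        seq = fringe.popleft()
        # Record any solutions this sequence encodes.
        for plan in find_solutions(seq):
            if plan not in sol:
                sol.append(plan)
        # Sequences are expanded whether or not they produced a solution.
        if len(seq) < max_len:
            for op in operators:
                fringe.append(seq + (op,))
    return sol


def toy_find_solutions(seq):
    """Stand-in test: a sequence solves the toy query if it ends with
    userid-room followed by finger."""
    return [seq] if seq[-2:] == ("userid-room", "finger") else []


print(plan_search(["userid-room", "finger"], toy_find_solutions, 2))
```

Raising `max_len` to 3 makes the toy search report longer sequences that end in the same two steps, which is the kind of repetition a redundancy check is meant to filter out.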

  14. Finding Solutions

  15. The FindSolutions function tests each sequence in the fringe to see whether it encodes one or more solutions to the query, that is, whether any plan built from the sequence is a solution. As described earlier, every solution plan has to satisfy two conditions: the binding patterns of its operator instances are satisfied, and every tuple it returns satisfies the query. If both conditions hold, the plan is a solution to the query.

  16. Redundant Solutions • The last line of the FindSolutions function checks whether a plan is redundant before it is added to the solutions. • Why should we check redundancy? • If a sequence of operator instances corresponds to a solution, then every supersequence will also generate the same solution • Occam keeps all sequences in the fringe, even those that have already produced a solution
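
One simple way to picture a redundancy check is a subsequence test: a candidate is treated as redundant if a known solution is already contained in it, in order. The slides do not give Occam's actual test, so this criterion is an assumption made for illustration.

```python
def is_subsequence(short, long):
    """True if `short` appears within `long` in order (gaps allowed)."""
    it = iter(long)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in short)


def is_redundant(candidate, solutions):
    """A candidate is redundant if it is a supersequence of a known solution."""
    return any(is_subsequence(s, candidate) for s in solutions)


solutions = [("userid-room", "finger")]
print(is_redundant(("userid-room", "finger", "finger"), solutions))
print(is_redundant(("finger", "userid-room"), solutions))
```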

  17. Reducing Search • Use of type information and constraint satisfaction speeds up FindSolutions • Duplicated operator instance pruning reduces the branching factor of the algorithm • Shuffled sequence pruning achieves the efficiency of a partial-order representation without its complexity • With these three optimizations implemented, Occam’s performance improved substantially.
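
The two pruning rules can be sketched as filters applied while expanding a sequence. The reading used here is an assumption, not a description of Occam's implementation: duplicate pruning skips an operator instance already present in the sequence, and shuffled-sequence pruning skips any sequence that is a reordering of one generated earlier.

```python
def expand(seq, operators, seen):
    """Return the children of `seq` that survive both pruning rules.

    `seen` is a set of canonical keys shared across calls.
    """
    children = []
    for op in operators:
        if op in seq:
            continue  # duplicated operator instance pruning
        child = seq + (op,)
        key = tuple(sorted(child))  # same key for every reordering
        if key in seen:
            continue  # shuffled sequence pruning
        seen.add(key)
        children.append(child)
    return children


seen = set()
ops = ["finger", "userid-room"]
print(expand((), ops, seen))
print(expand(("finger",), ops, seen))
print(expand(("userid-room",), ops, seen))
```

Sorting the operators to form a key is just one convenient canonical form; it treats a sequence as an unordered collection, which is how a partial-order planner would see it.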

  18. This was demonstrated on 5 problems taken from 4 domains. Each experiment was run on vanilla Occam as well as standard Occam. The results show that standard Occam provides a speedup of two orders of magnitude

  19. Advantages of Occam • Integrates both legacy systems and relational databases • Reasons about the capabilities of various information sources • Handles partial goal satisfaction, i.e. gathers as much data as possible when it can’t gather exactly what the user has requested • It is sound and complete • Efficient

  20. Conclusion The execution system uses a user-specified utility function to balance the time needed to execute a plan against the number of tuples it is expected to return. Plan quality is the major focus of Occam, with the aim of giving end users the best possible information

  21. Questions ?

  22. Thank you
