Infomaster is an advanced information integration tool designed to tackle the challenges posed by the vast amount of information available online. It addresses issues such as fragmentation, heterogeneity, and the need for semantic understanding. By utilizing intelligent agents, Infomaster provides integrated access to evolving information sources, allowing users to query across diverse databases uniformly. The tool is tested in various application areas, including newspaper classifieds and product catalogs, ensuring efficient data retrieval and effective format translation.
Infomaster: An Information Integration Tool • O. M. Duschka and M. R. Genesereth • Presentation by Cui Tao
Introduction • Huge amount of information online: • Distribution: not every query can be answered by the data in a single database • Fragmentation: horizontal, vertical • Heterogeneity • Notational heterogeneity: different access languages and protocols (HTML parsing, SQL, OQL, Z39.50) • Conceptual heterogeneity: semantic mismatches • Instability
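To make the fragmentation bullet concrete, here is a minimal sketch, assuming a toy setting in which two newspapers each hold some of the car ads (horizontal fragments) and a third source holds the prices keyed by ad id (a vertical fragment). All names and values are hypothetical and are not taken from Infomaster.

```python
# Hypothetical sources: two horizontal fragments and one vertical fragment.
ads_paper_a = [{"id": 1, "make": "BMW", "year": 1996}]
ads_paper_b = [{"id": 2, "make": "Audi", "year": 1995}]
prices = {1: 21000, 2: 15000}  # ad id -> asking price


def integrate(horizontal_fragments, vertical_fragment):
    """Union the horizontal fragments, then join the vertical fragment on id."""
    rows = [row for fragment in horizontal_fragments for row in fragment]
    return [
        {**row, "price": vertical_fragment[row["id"]]}
        for row in rows
        if row["id"] in vertical_fragment  # drop rows with no matching price
    ]


cars = integrate([ads_paper_a, ads_paper_b], prices)
```

Horizontal fragments are recombined with a union and vertical fragments with a join, which is why no single source can answer a query about both attributes and all ads.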
Introduction • Intelligent agents • Search and find desired information • Convert formats • Translate between different contexts • Etc. • Not feasible yet • Considerable research in ontologies and natural language understanding is still required
Introduction • Infomaster: an information integration tool • Provide integrated access • Manage evolving information sources • Add new information sources • Remove outdated information sources
Tested Application Areas • Newspaper classifieds • Provide a uniform search interface • Gather corresponding classifieds from all relevant newspapers • Product catalogs • Provide terminology translation • Campus databases
Descriptions of Relationships • Interface relations and site relations are described in terms of base relations • Interface relation vs. base relation: [Figure: Interface and Base relations]
Descriptions of Relationships • Site relation vs. base relation: [Figure: Base and Site relations]
Query Processing • Example: BMWs built in 1996 that are for sale at a price below their average market value.
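The example query can be read as a selection on the ads joined with a market-value lookup. The sketch below evaluates it over in-memory data; the relation names `car_ads` and `market_value`, their fields, and the sample values are assumptions made for illustration, not the relations used in the original system.

```python
# Hypothetical relations; names, fields, and values are illustrative only.
car_ads = [
    {"make": "BMW", "model": "318i", "year": 1996, "price": 19000},
    {"make": "BMW", "model": "318i", "year": 1996, "price": 26000},
    {"make": "BMW", "model": "318i", "year": 1995, "price": 15000},
]
market_value = {("BMW", "318i", 1996): 22000, ("BMW", "318i", 1995): 18000}


def bargains(ads, values, make="BMW", year=1996):
    """Return ads of the given make and year priced below their market value."""
    result = []
    for ad in ads:
        value = values.get((ad["make"], ad["model"], ad["year"]))
        if (ad["make"] == make and ad["year"] == year
                and value is not None and ad["price"] < value):
            result.append(ad)
    return result


matches = bargains(car_ads, market_value)
```

In an integrated setting the two relations would come from different sources, which is what the following reduction and abduction steps have to account for.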
Reduction: Interface relations → Base relations • Simple: User’s query → Interface relations → Base relations • Example rewritten query:
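One way to picture the reduction step is as unfolding: each interface atom in the query is replaced by the base atoms that define it. The sketch below assumes a single conjunctive definition per interface relation and only one level of unfolding; the relation names (`car_for_sale`, `ad`, `ad_price`) and the tuple encoding of atoms are hypothetical.

```python
from itertools import count

# Hypothetical definition of one interface relation over base relations:
# interface name -> (head variables, body atoms over base relations).
DEFINITIONS = {
    "car_for_sale": (
        ("Make", "Year", "Price"),
        [("ad", ("Id", "Make", "Year")), ("ad_price", ("Id", "Price"))],
    ),
}


def unfold(atom, fresh):
    """Replace an interface atom by its defining base atoms."""
    predicate, args = atom
    if predicate not in DEFINITIONS:
        return [atom]  # already a base atom
    head_vars, body = DEFINITIONS[predicate]
    binding = dict(zip(head_vars, args))
    suffix = next(fresh)  # rename body-only variables to avoid clashes
    return [
        (p, tuple(binding.get(v, f"{v}_{suffix}") for v in variables))
        for p, variables in body
    ]


def reduce_query(atoms):
    """Unfold every atom of a conjunctive query once."""
    fresh = count()
    return [base for atom in atoms for base in unfold(atom, fresh)]
```

For instance, `reduce_query([("car_for_sale", ("bmw", 1996, "P"))])` yields two base atoms that share a fresh `Id_0` variable, so the join between the ad and its price is preserved.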
Abduction: Base relations → Site relations • Site relations are expressed in terms of base relations, but not vice versa • Query rewriting problem: answering queries using views • Abduction: uses a standard model elimination theorem prover
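The sketch below is not the model elimination prover mentioned above. It swaps in a much simpler, bucket-style candidate enumeration to show the shape of the problem: for every base relation the query needs, collect the site relations whose descriptions mention it, then combine one choice per base relation into a candidate plan. The `bluebook` source and the `market_value` relation are hypothetical, and a real rewriting procedure would also verify each candidate against the full site descriptions.

```python
from itertools import product

# Hypothetical site descriptions: site relation -> base relations it covers.
SITE_DESCRIPTIONS = {
    "sjmn": {"ad"},
    "sfc": {"ad"},
    "bluebook": {"market_value"},
}


def candidate_plans(base_predicates, descriptions=SITE_DESCRIPTIONS):
    """Enumerate one site relation per required base relation."""
    buckets = [
        sorted(site for site, bases in descriptions.items() if predicate in bases)
        for predicate in base_predicates
    ]
    # An empty bucket means no source covers that relation: no plans at all.
    return list(product(*buckets))


plans = candidate_plans(["ad", "market_value"])
```

Each tuple in `plans` names the sources to join; the overall answer is the union of the answers of all plans.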
Abduction: Base relations → Site relations • Notation (symbols not preserved): the set of all descriptions of the site relations; a set of site relations; the rewritten user query after the reduction step
Abduction: Base relations → Site relations • The example query plans:
Optimization • Assume: all ads in sjmn are in sfc
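Under the containment assumption above, any plan that reads from sjmn is redundant whenever the same plan with sfc in its place is also available. A minimal sketch of that pruning follows; the plan encoding (tuples of source names), the `bluebook` source, and the `CONTAINED_IN` table are hypothetical, and containment is assumed to be acyclic.

```python
# Assumption taken from the slide: every ad in sjmn also appears in sfc.
CONTAINED_IN = {"sjmn": "sfc"}


def prune_redundant(plans, contained_in=CONTAINED_IN):
    """Drop plans subsumed by a plan that uses the containing source."""
    available = set(plans)
    kept = []
    for plan in plans:
        redundant = False
        for i, source in enumerate(plan):
            bigger = contained_in.get(source)
            if bigger is None:
                continue
            # Same plan with the contained source swapped for its superset.
            alternative = plan[:i] + (bigger,) + plan[i + 1:]
            if alternative in available:
                redundant = True
                break
        if not redundant:
            kept.append(plan)
    return kept


pruned = prune_redundant([("sfc", "bluebook"), ("sjmn", "bluebook")])
```

The sjmn plan is only dropped when its sfc counterpart exists, so pruning never loses answers under the stated assumption.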
Conclusions • The first integration system: • Arbitrary positive relational algebra user queries • Database descriptions • Efficient optimization using: • Integrity constraints • Local completeness information • Flexible use of query planning: • Expressive description language • Constraints • Background theories
Related Work • Information Manifold project and SIMS project: • Explore the use of description logics for describing information sources • Occam project: • Uses general AI planning techniques to generate information-gathering plans • TSIMMIS project: • Uses pattern-matching techniques to match user queries against predefined queries