1 / 33

Structural and Semantic Heterogeneity in Database Schema Integration

Structural and Semantic Heterogeneity in Database Schema Integration. SIXTH Conference of Department of Computing Wednesday 4 May 2005 David George. Presentation Content. Why is Integration necessary? Evolution of Integration Approaches

beniton
Download Presentation

Structural and Semantic Heterogeneity in Database Schema Integration

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Structural and Semantic Heterogeneityin Database Schema Integration SIXTH Conference of Department of Computing Wednesday 4 May 2005 David George

  2. Presentation Content • Why is Integration necessary? • Evolution of Integration Approaches • Barriers to Integration - Structural and Semantic Heterogeneity • New opportunity - Ontology and the Semantic Web

  3. Why Database Integration?

  4. Drivers for Data Integration • Global organisations with distributed data. • Organisations having legacy and new databases. • Organisational change e.g. business re-engineering and acquisitions. • Autonomous departments with disconnected systems requiring interoperability e.g. Financial Services. • Business Intelligence requiring: • decision-support systems. • customer analysis and marketing strategies. • data mining

  5. Schema Integration Local DB schema Global schema integration Queries Global Schema Schema 1 Queryinput Global:Local Schema Mapping Schema 2 Queryoutput Schema n

  6. Evolution in Integration Approaches

  7. Evolution in Integrations Global Domain Agreements Knowledge Digital media Visual/Spatial/Temporal Data [Kiosk/Geographic/Flights/Forecasting] Focus – Semantics Domain-specific Information Structured, Semi-structured Text repositories Focus - Syntax of data type, format & Schema constructs Data Structured DBs, Files Focus – Systems & Communications System Schema Integration Common Data Models Virtual Integration Single Ontologies Multiple ontologies, Inter-ontological Local Task Schemas Federated DBS Federated IS (inc Mediators) Information Brokering 1985 1995

  8. Federated DBMS approach External Schema 1.1 External Schema 1.2 External Schema 2.1 Common Data Model Federated Schema 1 Federated Schema 2 Export Schema 1.1 Export Schema 2.1 Export Schema 2.2 etc Component Schema 1 Component Schema 2 Local Schema 1 Local Schema 2 Application: Integration of business databases Component DBS 1 Component DBS 2

  9. FDBMS schema architecture External Schema 1.1 External Schema 1.2 External Schema 2.1 External Schema 2.2 Common Data Model Federated Schema 1 Federated Schema 2 Export Schema 1.1 Export Schema 2.1 Export Schema 2.2 Export Schema 3.1 Component Schema 1 Component Schema 2 Component Schema 3 Local Schema 1 Local Schema 2 Local Schema 3 Component DBS 1 Component DBS 2 Component DBS 3

  10. Mediator/Wrapper (Virtual integration) Network Internet Query Translation Mediator Application: Integrated access to Heterogeneous data

  11. Multiple Worlds Information Mediation Crime Statistics Demographics Property Sales School Rankings Information Brokering Search Query: “Find detached houses for sale under £300k with 2 bathrooms, 3 bedrooms, a local school rated in the upper quartile of govt. league tables, in a district with below-average crime rate and a socio-economically diverse population?”

  12. Barriers to Integration - Structural & Semantic Heterogeneity

  13. Recipe for Heterogeneity and Conflict Conceptualisations of the real world are influenced by the designers view of the Concept and Context to be modelled

  14. Schema Type Conflicts Name Address Title Name Book Topics Publisher Book-Topic Pub-Book Title Title Publication Keyword Code Code Pub-Keyword Research Area Publisher

  15. Taxonomy of Schema Conflicts

  16. Incoherence in Cardinality Invoice Invoice Invoice 1 n 1 Inv:Order Inv:Order Inv:Order m 1 m Order Order Order

  17. Abstraction and Schematic Conflicts

  18. Generalisation/Specialisation Conflicts Schema 1 Schema 2 Student Student S_Type (ID#,Name,Type,Course) U-graduate Graduate (ID#,Name,Course) (ID#,Name,Course) i.e. U-graduate in schema 2 represented at more general level in schema 1

  19. Specialisation Classification Conflicts Employee Criteria inconsistency Employee Gender Role Sex Service Person Person Degrees inconsistency <30 30-60 >60 <25 25-55 >55 Customer Customer Characterisation inconsistency Child Adult Senior Teen Parent G-Parent

  20. Aggregation Conflicts Aggregation used in schema 1 is represented by a set-of entities in schema 2 Also NB: mapping exists in only one direction Convoy Schema 1 (ID#, Av_Weight, Location) Ship (ID#, Weight, Location, Captain) Schema 2

  21. Aggregation Conflicts (contd) Employee(department) vs. Employee(division(department)) Component class of collection CarType(carMake, carDesign) vs. FamilyType(carMake, saloonSize) Aggregation Specialisation Person(address, tel) vs. Person(street, city, county, tel) Aggregation Composition

  22. Schematic Discrepancies Data:Attribute:Entity conflicts Stock DB1 Value (Date, StockCode, ClosePrice) (stockItem) Stock DB2 (Date, StockItem1, StockItem2, …StockItemn) (ClosePrice) Attribute StockItem1DB3 StockItemnDB3 (Date, ClosePrice) (Date, ClosePrice) ..….. Entity Entity

  23. So where next?

  24. New Solutions - Ontologies and the Semantic Web

  25. Ontologies in Computing • Formal vocabulary of a “universe of discourse”. • Ontologies define: • concepts and their attributes • relationships between concepts • constraints on those relationships “An Ontology is a formal, explicit specification of a shared conceptualization” (Gruber, 1993 & Borst, 1997)

  26. Biblio-Thing Document Book Miscellaneous-Publication Proceedings Edited-Book Technical-Manual Cartographic-Map Computer-Program Doctoral-Thesis Newspaper Journal Master-Thesis Magazine Bibliographic Data Ontology (extract) Agent Person Organization Author Publisher University Thesis Periodical-Publication http://www.ksl.stanford.edu/knowledge-sharing/ontologies/html/

  27. Types of “ontologies” • DE BRUIJN, J. (2003) Using Ontologies - Enabling Knowledge Sharing and Reuse on the Semantic Web [online]. Innsbruck, Austria, DERI – Digital Enterprise Research Institute. Available from: http://www.deri.ie/publications/techpapers/documents/DERI-TR-2003-10-29.pdf. [Accessed 15 February 2005]. • Value restrictions: values of properties are restricted (e.g. by a datatype). • General logic constraints: values may be constrained by using values from other properties. • First-order logic constraints: very expressive constraints between relationships such as: • disjoint classes, inverse relationships, part-whole relationships.

  28. Semantic Web DE BRUIJN, J. (2003) Using Ontologies - Enabling Knowledge Sharing and Reuse on the Semantic Web [online]. Innsbruck, Austria, DERI – Digital Enterprise Research Institute. Available from: http://www.deri.ie/publications/techpapers/documents/DERI-TR-2003-10-29.pdf. [Accessed 15 February 2005].

  29. Semantic Web Tower OWL: Clients in S1same-As Customers in S2 RDFS: person X is a LivingPerson RDF: person X is named “Bill". OWL Ontology Language “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” Tim Berners-Lee et al., 2001

  30. RDF Example Object, Attribute, Value Triple: Predicate, subject, object

  31. End of Presentation

  32. Semantic Data Model

  33. Evolution in Interoperability Knowledge Global Domain Understanding comprehensive metadata and ontology approaches Digital media Visual/Spacio-Temporal Modelling Scientific/Engineering I n t e r o p e r a b i l i t y Information Key focus on: Semantics & more domain-specific Multi-modal sys Structured, Semi-structured (HTML etc) Text repositories Understanding use of metadata & schematic heterogeneities Data Key focus on: Syntax – data types/format Structure – schema constructs O-O sys Structured DBs, Files System E-R sys Key focus: Systems & Comms. Common Data Models Schema translation & Integration Schematic & metadata relationships, Wrappers, Single Ontologies Multiple ontologies, Inter-ontological, Metadata standards Local Schema MDBMS / Federated DBS Fed. Inf. Systems / Mediators Mediator / Information Brokering 1985 1995 Architectures

More Related