1 / 40

Physical Design Patterns in Information Systems

Karim Ali & Sarah Nadi CS848 – Spring 2010 July 14 th , 2010. Physical Design Patterns in Information Systems. Outline. Database lifecycle Elements of Physical Design in Traditional Databases Physical Design of Different Systems Memory Based Relational Database Systems (MMDB)

elton
Download Presentation

Physical Design Patterns in Information Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Karim Ali & Sarah Nadi CS848 – Spring 2010 July 14th, 2010 Physical Design Patterns in Information Systems

  2. Outline • Database lifecycle • Elements of Physical Design in Traditional Databases • Physical Design of Different Systems • Memory Based Relational Database Systems (MMDB) • XML Databases • Data Warehouses • Conclusions & Future Work Karim Ali & Sarah Nadi

  3. Database Lifecycle Karim Ali & Sarah Nadi

  4. Elements of Physical Design Traditional Disk Based Relational Databases Karim Ali & Sarah Nadi

  5. Traditional database systems • Disk Based Relational Databases (RDBMS) • Data is stored on disk as relations (i.e. tables) • Data is organized based on a relational model • Elements of physical design: • Index Structures • Materialized Views • Partitioning • Clustering • Data compression, striping, mirroring & denormalization Karim Ali & Sarah Nadi

  6. 1. Index Structures • Data needs to be organized for quick searching • Need to minimize expensive I/O operations • Examples: • B+trees • Bitmap Index • Hashtable • ... Karim Ali & Sarah Nadi

  7. 1. Index Structures Cont’d • B+tree Karim Ali & Sarah Nadi

  8. 1. Index Structures Cont’d Karim Ali & Sarah Nadi

  9. 2. Materialized Views • Repeated complicated queries should not have to be executed every time • Saves execution time & I/O reads by pre-computing the results • Materialized views are stored on disk • Rewriting queries using materialized views speeds up execution Karim Ali & Sarah Nadi

  10. 3. Partitioning • Divides the data into related partitions • Horizontal Partitioning: divides tables into sets of rows according to a specific attribute (E.g. Date ranges) • Vertical Partitioning: divides tables into sets of columns • Reduces table scan time • Improves performance Karim Ali & Sarah Nadi

  11. 3. Partitioning Cont’d Types of Horizontal Partitioning: Karim Ali & Sarah Nadi

  12. 4. Clustering • Grouping related items together for • Efficiency of access • Resource utilization • Achieved on the page level on disk • Pros • Very useful for multidimensional queries (e.g. group by) • Reduced I/O operations • Reduced CPU cost • Cons • Difficult to define clustering keys, clustering scheme, and the granularity of clustering Karim Ali & Sarah Nadi

  13. 5. Other Methods • Data Compression: • Fitting more data into a fixed amount of space • Striping: • Distribute data that is accessed together across multiple disks • Mirroring: • Duplicating the data to multiple disks • Denormalization: • Refine global schema to reflect query and transaction requirements Karim Ali & Sarah Nadi

  14. Physical Design of Different Information Systems Karim Ali & Sarah Nadi

  15. 1. Main Memory Databases (MMDB) Karim Ali & Sarah Nadi

  16. MMDBOverview • Primary copy of data resides in main memory • Cheaper to access main memory • MMDB has better performance • Usually has an archived copy of the data in case of crashes Karim Ali & Sarah Nadi

  17. MMDB1. Index Structures • Use: • Reduce overall computation time without using too much extra space • Factors to consider: • Read/Write operations are cheaper • Should be cache conscious • No need to store data in the index structure • Categories of indexes used: • B+trees • T Trees • Search Trees Karim Ali & Sarah Nadi

  18. MMDB1. Index Structures Cont’d Karim Ali & Sarah Nadi

  19. MMDB2. Materialized Views • Not beneficial to MMDB • Cost of computing complicated queries is much less • Maintenance costs will outweigh benefit Karim Ali & Sarah Nadi

  20. MMDB3. Partitioning • Not necessary in main memory • Used for the secondary storage on disk to speed up reload • Horizontal partitioning • Vertical partitioning Karim Ali & Sarah Nadi

  21. MMDB4. Clustering • Not applicable to MMDB • Sequential access in main memory is not cheaper than random or dispersed access Karim Ali & Sarah Nadi

  22. 2. XML Databases (XML DBs) Karim Ali & Sarah Nadi

  23. XML DBsOverview • XML-enabled DBs (E.g. SQL Server, Oracle): • Maps XML documents to relational tables • Useful for data-centric XML documents • Native XML DBs (E.g. eXist, Sedna) • Useful for document-centric, semi-structured XML documents • Data structures store actual XML nodes with pointers between them Karim Ali & Sarah Nadi

  24. XML DBs1. Index Structures • XML-enabled DB • Shred XML data in a relational table with columns: ORDPATH, tag, node type, value, path ID • Use a B+tree index based on combination of primary key of base table & the ORDPATH Karim Ali & Sarah Nadi

  25. XML DBs1. Index Structures Cont’d • Native XML DB • Index for document collections • Index for nodes • Numbering schema for the XML nodes • B+tree (or variations) used on the numbered nodes Karim Ali & Sarah Nadi

  26. XML DBs2. Materialized Views • Greatly enhances performance • XQuery and Xpath query results are materialized • Query rewriting is more tricky due to semi-structured nature & complicated querying languages Karim Ali & Sarah Nadi

  27. XML DBs3. Partitioning • Horizontal partitioning • Based on node type • Through inlining Karim Ali & Sarah Nadi

  28. XML DBs4. Clustering • Elements & sub-elements are clustered together • XML documents are clustered based on structural similarity Karim Ali & Sarah Nadi

  29. 3. Data Warehouses Karim Ali & Sarah Nadi

  30. Data WarehousesOverview • Collection of data and decision support technologies • Used in: • Retail: user profiling • Finance: claims analysis, risk analysis, credit card analysis, and fraud detection • Healthcare: outcomes analysis Karim Ali & Sarah Nadi

  31. Data WarehousesDesign Karim Ali & Sarah Nadi

  32. Data WarehousesChallenges • Data is usually • Extremely large • Multi-dimensional • Priority for aggregated and summarized data • Ad-hoc and complex queries • Expensive operations: aggregation, and joins • The fact table in the star schema participates in every join Karim Ali & Sarah Nadi

  33. Data Warehouses1. Index Structures Karim Ali & Sarah Nadi

  34. Data Warehouses2. Materialized Views • To materialize or not? • Workload characteristics • Cost for incremental update • Storage requirements • Pros • Improves performance through fast lookups • Useful for rollup and drilldown operations • Cons • Not applicable to all queries (e.g. ad-hoc queries) Karim Ali & Sarah Nadi

  35. Data Warehouses3. Partitioning • Horizontal & Vertical partitioning are used • Pros • Ability to manage larger databases • Enhances query performance over large tables • Enables parallel processing • Facilitates data compression • Cons • Complexity: managing partitions • Efficiency: number of partitions affects the performance of meta data operations (e.g. browsing the data cube definition) Karim Ali & Sarah Nadi

  36. Data Warehouses4. Clustering • Data is clustered by nature Karim Ali & Sarah Nadi

  37. Conclusions & Future Work Karim Ali & Sarah Nadi

  38. Karim Ali & Sarah Nadi

  39. Future Work • Physical design of industrial systems • Interplay between systems • Automating physical design • List of the open problems in physical design Karim Ali & Sarah Nadi

  40. Thank you Karim Ali & Sarah Nadi

More Related