Relational database design

Relational database design Lecture 4 Physical database design, part I Balancing of user demands Secondary storage structures for relational databases Choice of allocation and access methods Configuration of database management system and database Parallelism Views Indexes Join-strategies Optimization of SQL transactions Maria Bergholtz SU/KTH

Positioning and examples (overview is given in Database systems chapter 16) Data- base “System” Physical design Logical design • Models • Analysis patterns • Synthetical and analytical • design • Secondary storage structures • Performance • optimization • Multimedia ? ANIMAL belongs_to SPECIES = < > Weight Mean_weight 0..* 1 1 0..* FLOCK 0..* 1 MEMBERSHIP Date Number Tabell 2 Table1 Name Address Phone Lisa Byvägen 3 11111 Olle Solgränd 2 22222 Pelle Solgränd 4 23456 Text, Video, Audio, Image Extender

Why physical database design ? • DBMS – complex programs • Different architectures (multi processor environment, client -server) • Multimedia (Image, Audio, Text, Video, mm) • Different platforms (Unix, NT, Mac, Windows, IBM ..) • Large databases • Many tables • Large tables • Large data types • Many rows • Many columns • Many users • Different demands • Different times of day: day, night, 24 hours • Read and/or write • Different types of access towards different tables

Where are the bottlenecks ? • Design - 45 % • Data modeling, Tables, Storage structures, Index • Application - 40 % • Transaction handling, SQL, Partitioning • Configuration 10 % • Server parameters, Locks, Transaction logging • Operating system 5 % • Disk layout, System parameters, Process priority, Memory management

Analysis of the USAGE of data • Analysis of the data model is a prerequisite in order to succeed with the physical design of the model (transactions, tables, change…) Analysis of the columns Analysis of the rows • What columns are read only? • Are all columns read/updated in parallel? • What users use what columns ? • Does some columns contain data that shall not be accessed by all users ? • What attributes/columns may serve as keys? • What are the domains of the columns ? • Row size ? Number of rows per table ? • What rows are read only? • Does updates happen regularly, once a month, day, each Friday, randomly? • Is the whole table always read in sequence ? • Is it possible to access the rows directly via the key? • What users use what parts of the database ?

Analysis of future change in usage of data • Identify • Growth rate of tables • Need for new tables • New groups of users with respect to existing tables • Need for new info in existing tables?

In order to determine what is good-enough performance, goals wrt access times etcetera, must be defined • How many transactions per minute ? • WHAT shall be optimized? Reading, writing, loading, reorganizing or what? Conflicts will exist! • What response times are ’acceptable’ ? • How many parallel users ? • Do not optimize if not needed!

Factors effecting performance in a relational database management system • DB design • Hardware • CPU, PM, buffer pools • Data structure • File organization and access methods • Optimizers • Parallelism • (Recovery)

Goals for physical database design cont. • Optimization against a defined transaction set is performed • Division of measures: • Program independent (not table changing) • Program dependent (table changing) • Other measures

Non table changing measures • Choice of DBMS • Changing storage structures • Other types of physical allocation of data • Configuration of DBMS and DB • Temporary tables • Secondary indexes • SQL - optimization

Data- base “System” Discs • The usual secondary storage memory type with a mobile read/write mechanism.. • Time for: • Move the read/write device to the correct cylinder (seek time) • Rotation time • Actual reading of a page • At writes one/several discrete rotations of the disc is needed. ”OVERHEAD” SYSCAT.TABLESPACES: OVERHEAD (milliseconds) = Seek-time + 0.5 * time_to_rotate_360_degrees, where time_to_rotate_360_degrees = (1/number of 360_degree_rotations _per_minute) * 60*1000 TRANSFERRATE (time in milliseconds to read a page from disc to memory): 1000*page_size_in_byte/number_of_transferred_Mb_per_second*1024*1024 Without parallelism, i.e. a table space = a container = 1 disk

Storage structures for relational databases • Secondary memory structures for relational databases • A database consists of a number of equally sized pages (blocks). Each relation is stored on a subset of these pages. Each page has a page number. • A page may contain several tuples. A tuple is usually contained within a page (exceptions are tuples containing ”large” data types as column domains). • A page may be filled completely or partly at load time of table. • There are different types of pages, for example ”index pages” and ”data pages”, which mirrors the fact that different types of information is grouped physically on different places.

Block, ”access-closeness” • Since positioning the read/write device takes the longest time, it is necessary to group rows into blocks or pages. • Typical block/page sizes are 4 and 8 Kb (DB2 supports both 4K-, 8K-, 16K and 32K-pages. For 4K-pages tables may contain max 500 columns and max 64 Gb in total. If larger pages are chosen, a table may have 1012 columns. A row may with 8K pages be max 8101 byte (i.e. a bit less than 8 * 1024). • Pages on the same track/cylinder takes less time to read (if red sequentially).

Tabell 2 Table1 Lisa Byvägen 3 11111 Olle Solgränd 2 22222 Pelle Solgränd 4 23456 Storage-hierarchy in a relational database [Gulutzan/Peltzer2003] Database Depending on DBMS a Page may be called data-block, blocking unit, control interval, row group etc. • Page – smallest unit of: • read, write • locking • caching • administration (id, overflow- • pointers, free space, • etcetera) Table spaces • All pages in a file have the same size • Page size is limited to multiples of • 1024 byte, default DB2 is 4K • Pages usually contain a discrete set of • rows • Often, and always recommended: all • rows in a page belong to the same table. • LOB: data (TEXT, IMAGE, etc.), are • usually stored in separate pages though. Files Extents Pages

Tabell 2 Table 1 Lisa Byvägen 3 11111 Olle Solgränd 2 22222 Pelle Solgränd 4 23456 Extents Database • Extent = number of pages stored in physical sequence. • Files are extended an extent a • time • Extents are units of • ALLOCATION rather than • reading or writing. Table spaces Files However, ALLOCATION- units and I/O units have relationships. If they are not of equal size, then at least that what is red/written, i.e. the LOCIGAL pages should be multiples Of the extent:s (or vice versa). Extents Pages

Files Tabell 2 Table1 Lisa Byvägen 3 11111 Olle Solgränd 2 22222 Pelle Solgränd 4 23456 Database • File - at least two ways to view the concept of a file: logically (as a group of records) or physically as a group of extents. A file is always manifested as a number of extents. • Many DBMS’s does not recognize the file concept. • Or allow so called mixed extents, i.e. extents with information about SEVERAL tables. Note: A partition is also a number of (continuous) extents (often a file). Table spaces Files Extents Pages

Table Space Tabell 2 Table 1 Lisa Byvägen 3 11111 Olle Solgränd 2 22222 Pelle Solgränd 4 23456 • Table space – file or group that contain (related) information (often a table). • CREATE TABLESPACE tbspace • MANAGED BY DATABASE • USING (’d:\filspace1’) • A table space may contain one • or several tables, a table and its index and/or • both. • In other words, it is possible to mix extents • from different ’objects’ in one and the same • table space. From a performance point of view • this might be good or bad. • En table space may separate concerns: • Different data in different table • spaces • Separate user data from system • data • Separate extents of different size • Separate ’back-up’ info from other info. Database Table spaces Files Extents Pages

Partitioning is the activity of splitting an object (table, table space, index, etc.) on two or several physical ”locations”, for example different discs. A partition is a group of extents allocated physically in sequence after each other (often called a file). Partitions - partitioning Replication: simply copies of various objects: tables, parts of tables etc., placed in different physical units. • + • In a multi disc environment this • makes it possible to reduce I/O • time • Less locks needed • - • Not necessary if the users are few • May increase I/O time because • related things are spread over • several locations

Organization and access methods • Organization methods • What algorithm to use to place rows on discs - Sequential - Direct addressing - Pseudo addressing (hashing) - Indexing - Clustered organization • Access methods • How to search the tables • Hashed access • Sequentially • Binary search • Via index • Via direct addressing • It is possible to organize a file in a certain manner and then search the file in a different manner.

What parameters effect choice of organisation and access method? • Required spaceto store a row • Timeto: • Access a row in random • Access the next row given an existing row in memory • INSERT a new row • DELETE a row • UPDATE a row • Read the entire table

Relative addressing • Variant of direct addressing, i.e. the address is derivable from the id • Absolute direct addressing • Address = id • Derived direct addressing • The address is derived from the id. Suppose it is possible to store 10 rows per page, i.e. (id DIV10) + 1 give what page, (id MOD 10) give location within a page. For example: Artnr 32 is stored on page 4 byte 2) • Relative addressing • Id give the posts placement in the file. Via the file control block (where the start position of the file is stored) the place relative to file start can be calculated.

Storage structure HASH • Storage position is a function of the id • Different id may give rise to the same storage position (Synonyms) • If bucket size > 1 two synonyms may be stored at each hashed address. If not, the second synonym is stored somewhere else (Overflow) • Close two one access for each row (exactly one read access for each row placed in the home area)

Hashing Overflow-pointer Bucket size = 2 Relative address 0 362 11 Home area consists of 1 Relative addresses 0-5 2 17 Posts with Id 98, 67 3 98 67 6 and 93 are hashed to one and 4 the same relative address: 3 5 513 6 93 Overflow-area starts at 7 address 6 (in this case home 8 area and ov-area in the same file) 9

When to hash? • Medium sized and large tables (>= 6 pages) • Tables often search in “random” order • If the id columns are often searched for • Seldom updated tables

What columns to hash? • Columns with an even distribution of values • Columns with few synonyms • Often searched columns • Do not hash often updated columns

Hash, pros and cons+ - • Fast access, close to 1 access per accessed row • Good if the id-value is large, compared to pure relative addressing where large id yield very large address spaces • Hard to foresee overflow • Data not sorted • Demand the whole key is given when to search • The home area is static, not good for growing files/tables

Storage structure Btree • The most common one in relational database management systems • Flexible solution that allow direct access as well as sequential access on indexed attributes. • B – trees are balanced (= equal number of accesses required to search a given row), the structure is self reorganizing.

All nodes but the root are at least • half full • ”Leaf pointers” in sequence • ”Balanced” tree • The nodes always have k ”children” • and (k-1) keys. 85 B+-tree (3 index-levels here, root, intermediate node, leaf node. Plus data nodes. 31 40 90 92 This node has 2 keys and 3 pointers to child nodes. 11 12 23 31 35 40 61 70 85 87 88 90 91 92 93 94 11 boll 35 .. 61 ... 85 ... 93 ... 90 ... 12 ... 31 ... 94 ... 92 ... 23 ... 88... 70 ... 87 ... 91 ... 40...

To insert a new row means (simplified) to locate what leaf node the new row should be put into.. If there is space enough, the key is written in this leaf node. The post itself may be placed where there is space and the pointer Post with id = 15 is inserted here in the leaf node is set to point to this address. Is the leaf node full a node split will occur, i.e. one leaf node is split into two, which then share the values of the original node (new key included) so that both new nodes will be half full. In the parent node is placed a pointer to the new leaf node. If the parent node is also full this node is also split. This might lead to new node splits all the way up to the root. If the root is split, a New root is added to the tree. In which case the B-tree has Grown with one index- Level. B+-tree, INSERT 85 This node has 3 keys and 4 pointers to child nodes. 23 11 15 31 40 61 85 87 88 91 92 94 12 35 70 90 93

Btree, pros • Good for dynamic tables (lots of updates will not affect the structure in a negative way and reorganization will have to occur less frequently) • No problems with overflow on data pages • UPDATE of PN does not move the row • Foreseeable ’access time’ (the tree is balanced, i.e. same number of accesses to reach a leaf.

Btree, cons • Not necessary for non-changing tables • Overkill for small tables

Clustered organization • The rows of the table are sorted in a certain order, i.e. the so called base-table is clustered in an order given by a certain column (eye-colour, LOCATION, size, …) • An extra sort is avoided • Clusters often combined with indexes

When to cluster? • Large and medium sized tables • Tables that are often sorted in a certain sequence • Tables often red in a certain sequence • Tables often searched for a certain series of values • Tables where the clustering column is not changed often • Nearly always good to cluster but how to choose ‘best’ clustering column?

Non table changing methods: • Choice of DMBS • Change of storage structure • Other types of physical allocation of data • Configuration of DB and DBMS • Temporary tables • Secondary indexing • Optimization • DB2: • Configuration • Architecture • Physical organization and access methods • (Secondary-) Indexing • Optimization and SQL reformulation

Configuration of DB2 • Configuration is done via configuration files related to three different levels of DB2 • System:Operating System Configuration File • Instance(s):Database Manager Configuration File • Database:Database Configuration File

Configuration parameters • Configuring your Database and Database Manager • There are more than hundred configuration parameters in DB2, which are used to control your system. • These parameters fall into two general categories: • Database Manager Configuration • Database Configuration • Some parameters are more important than others. In the Administration Guide chapter 19, you can find • all the parameters available in DB2, a detailed description and recommendations. To give you an idea • of these parameters, some of them are shown below.

How change configuration parameters? GUI The configuration parameters can be accessed through the Control Center by right-clicking on the Instance (Database Manager) or Database and choosing “Configure…” from the menu. In the GUI version the parameters are nicely categorized by functionality. Command Window Run: DB2 GET DBM CONFIGURATION to get the configuration parameters for your Database Manager (DBM) and DB2 GET DB CONFIGURATION FOR <DATABASENAME> to get the configuration parameters for your database. To modify a Database Manager configuration parameter type: DB2 UPDATE DBM CONFIGURATION USING <PARAMETER> <VALUE> To modify a Database configuration parameter type: DB2 UPDATE DB CONFIGURATION FOR <DATABASE> USING <PARAMETER> <VALUE> Example: DB2 UPDATE DB CONFIGURATION FOR X62Maria USING SORTHEAP256

Configuration database performance

Relational database design