1 / 34

Building Things : Disks Databases

4. Building Things : Disks Databases. Introduction. This is a mix and match section It contains all the information that does not have a home elsewhere in this presentation It covers Disk layouts Disk Farms Striping on Striping Database build. Disk Layouts.

talen
Download Presentation

Building Things : Disks Databases

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 4 Building Things : DisksDatabases

  2. Introduction • This is a mix and match section • It contains all the information that does not have a home elsewhere in this presentation • It covers • Disk layouts • Disk Farms • Striping on Striping • Database build

  3. Disk Layouts • Basically in an IQ-M system you can have a number of different disk configurations • We consider the following • Simple SCSI (or UDMA-[1]66) Disks • Basic Disk “Farms” (Collections of raw disks) • RAID Systems (NT RAID, Clariion etc.) • Disk Subsystems (EMC, MTI Guardian etc.)

  4. Basic Disk Farms • Here we are considering a number of “dumb” disks • There is no intelligence in the disks – or the controllers • Here is where we use IQ-M striping • Here we have a number of disk drives (on various controllers) • We stripe across a series of disks

  5. Disk Striping Basic Disks 1-dimensional IQ Striping In the 1-dimensional striping we are spreading the data writes (and reads) across a series of disk drives – limiting the overall disk head movement.

  6. Problems with Disk Striping • IQ-M has to make location decisions as to where to write and read the data • It is not as simple as writing to “bit bucket” or a serial list of disk blocks • It is faster than simple 1 disk = 1 device • The performance hit is measured in extra micro seconds per read/write – but with a 10 TBytes database this could be important

  7. In RAID 1 – Mirror, we can consider that Disks A and B are mirrored and Disks C and D This improves read and write performance, as the controller will pick which disk heads are closer to the data required. Also there is an improvement in data security – 2 copies A D A D C B C B RAID – 0,1 Redundant Array of Inexpensive Disks Raid 1 - Mirror In RAID 0 – Stripe, we can spread the reads and writes as per the last slide Raid 0 - Stripe This dramatically improves the read and write performance, but does nothing for the data security

  8. In RAID 0/1 – Stripe and Mirror, we can see that the stripe sets on A and B mirror to C and D This gives security – in the mirroring (copies) of the data, and improved performance due both to the two copies and the stripe sets A D A D C B C B In RAID 5 – Csum/MR, The data is in “stripes” across the disks then a checksum is written to the final disk This gives a small performance improvement for reads, no improvement for writes (in fact it will slow writes down!). The only bonus is security. The system will allow a disk failure in the pack without requiring the 2 to 1 overhead of mirroring systems RAID – 0/1 and 5 Redundant Array of Inexpensive Disks Raid 0/1 – Stripe and Mirror Raid 5 – Checksum Multi-read

  9. Other RAID • There is RAID 2, 3 and 4 • These are variations on the mirroring/striping and checksum theme • There are also proprietary RAID schemes • RAID S The is EMC’s RAID system • RAID M This is the MTI RAID Scheme • We will come on to talk about proprietary schemes in a few slides time

  10. Well is RAID useful ? • Yes it is – mainly ! • If the RAID system handles the data distribution at the hardware level (with RAID controllers) then RAID is both fast and safe • Windows NT (and 2000) can drive RAID through the operating system • This can be slower than IQ-M performing the striping • Watch out for large numbers of reads – and individual read performance slowing down • Also Avoid RAID 5. It is very slow for writes

  11. Disk Subsystems (emc, HDS) • These are disk arrays where we have no idea what the internal organization of the disks is (and nor do we care) • Generally we can consider that we are writing to memory (and usually we are!) • The tuning of the disks is usually best left up to the Hardware Support guys

  12. Striping on Striping • There is one very contentious area in disk organisation • This is 2 dimensional disk striping or striping on stripes • This is driven by using an operating system or RAID unit to provide 1 dimensional striping on a series of disks • Then we apply IQ striping onto the existing striping • 2 Dimensional stripes • Do it right – it’s very fast, do it wrong and it isn’t!

  13. A diagram Operating System Striping AS IQ-M Striping The reason this works is now the individual disks only have a very small amount of data to read and write so we rely on very fast disk -> processor communications If this is done properly it is the fastest disk access for ASIQ-M The secret, if there is one, is to make the block size the same as the “micro” stripe size… But this needs further experimentation

  14. Disk Striping • By default Disk_Striping is set ON for RAW devices and OFF for file system devices • We should be using RAW devices – they are (generally) a lot faster and potentially safer than file system devices • You may want to play with this parameter if you are running devices on disk farms or RAID array systems

  15. File system Storage • If you must have the IQ Store on Operating System File System • OS_File_Cache_Buffering can be set off for Solaris and Windows NT (and 2000) • However the system will slow down in the following areas • IQ Page Size < file system block size • During Loads • Solaris > 4Gb memory • The description in the manual is awesome!

  16. Create Database • OK so we have built the device or devices – about time we considered some of the options to create the database • Of all of the options when issuing a CREATE DATABASE command the following slide details the most important

  17. CASE - 1 • CASE • CASE RESPECT is the fastest • There is a 10-20% hit going to CASE IGNORE • Implications to FP Indexes • Regardless of RESPECT vs. IGNORE all 1-byte and 2-byte FP indexes store all the binary values for the data • So ABC, abc, Abc, Abc are all stored even for CASE IGNORE

  18. CASE - 2 • Remember because we store all bitmaps we can go from 1-byte to 2-byte FP, or 2-byte to flat FP where we might not want to • A solution to this is to set the server in CASE RESPECT (because it is faster) • Then use an ETL tool to rtrim() and ucase() or lcase() all of the incoming character data

  19. CASE - 2 • The HG index stores data in what is called “conditioned” mode. • For a CASE IGNORE database there is only one entry per logical value • ABC = abc = Abc etc. • For a CASE RESPECT database there has to be one entry per value • ABC != abc != Abc etc.

  20. CASE – 3 • For an LF index we hold partially conditioned values • For CASE RESPECT and CASE IGNORE all values have a bit-map • This can be wasteful on space • The reason for having this is two fold • To allow for the recreation of the FP index from the LF • To allow for some rare cases (some group by’s) where we still project values from an LF index

  21. COLLATION • COLLATION • Set to ISO_BINENG, this is the fastest • If you must have a collation sequence this will slow the system down by around 10% for 8 bit character sets and 50% for multi-byte character sets • There are substantial slow downs for all multi-byte character sets

  22. IQ PAGE SIZE - 1 • This is an area of extreme contention • The IQ PAGE size (effectively) determines the the size of the smallest addressable area in memory • NOT ON DISK ! • The disk parameter is the BLOCK SIZE

  23. IQ PAGE SIZE – 2 • The rules for IQ PAGE SIZE (or memory buffer size) are simple • Set to 64K unless…. • Set to 128K when the memory model exceeds around 1 Gbytes per cache, and the number of rows in the FACT table exceeds 100m • Set to 256K when the memory model exceeds 4 Gbytes per cache (this may be a rare case!) • Do not set to 512K – there is a little bug…

  24. BLOCK SIZE • The block size is set automatically when the IQ PAGE SIZE is set • (In 11.x IQ you could set the Max Compression parameter that would vary the number of blocks per page) • In IQ 12, you can vary the BLOCK SIZE but it does not do much except in some extreme cases

  25. So what is BLOCK SIZE? Memory Cache Disk Device A Page When a “page” is written out of memory it is compressed. Only the resulting used “blocks” within the page are written to disk, this set of blocks is called a “chunk” A Chunk (A variable length object)

  26. Writing to Disk Default Page in memory 64Kb – 16 Blocks 10 Blocks used before compression Compress before Disk Write Operation Compressor 4 blocks used after compression Disk Write 1 Chunk on Disk the size of 4 blocks - 16kbytes

  27. Disk_Striping_Packed • The problem with large systems is that we only have 1 freelist • This means that if we want 8 blocks of space we will grab the first > 8 space we can find, which tends to fragment the devices • If Disk_Striping_Packed is ON then we have one freelist for each “number of blocks” available • 1 for 1 block free • 1 for 2 blocks free • Etc. • The trade-off is better space usage against slightly worse localisation.

  28. BLOCKS • So as can be seen the IO operation is performed in blocks • but how many are written at one time depends upon • The usage (how much of the page is used) • The Compressibility (how well the used blocks compress) • We have found the default works the best • That is for a 64K page 16 blocks of 4k

  29. Block Size • The compressor is designed to run in 2Kbyte chunks so • Never make the block size less than 2Kbyte • Always make is a multiple of 2Kbyte • The compressor does not have a huge amount of memory to run in so • If the Page size is big (>128Kbyte) do not set the block size much bigger than 32-64Kbyte

  30. Scoping - 1 • ASIQ-M 12.4.2 used the ASA scoping rules, not the ANSI rules • In ANSI there are strict rules as to where aliases can be used, and what they are • ASA has rules that an extension to the ANSI rules and allow much more freedom • In ASA you can re-use and rename anything, anywhere

  31. Scoping - 2 • An example: Select alpha = 27, beta, gamma From tablename Where alpha between 27 and 29 • In ASE rows are returned that match the where clause i.e. alpha between 27 and 29 • In ASA/ASIQ all table rows are returned • Why ?

  32. Scoping – 3 Select alpha = 27, beta, gamma From tablename Where alpha between 27 and 29 • In ASIQ the following happens • In the select we have defined an alias alpha and assigned a value to it (27) • Then in the where clause we check the value of alpha against the values 27 thru 29 and the condition is true because alpha is 27 • Note:Alpha is defined as an alias here not a column identifier

  33. A Picture ASE In ASE this alpha is a alias Select alpha = 27, beta, gammaFrom tablenameWhere alpha between 27 and 29 In ASE this alpha is a column name In ASIQ this alpha is a alias In ASIQ this alpha is also a alias, because it has been defined ASIQ To make the code “correct” for ASIQ you need to prepend the second alpha with The table name. E.g. tablename.alpha

  34. Building Things - End

More Related