1 / 93

Database Design for Success

Database Design for Success. Informix 101. #include <disclaimer.h>. Agenda. Sizing your machine Install and Configure the Engine Build a database Loading Data Monitoring, Tuning & Ongoing Administration Odds, Ends and other Pet Peeves. Sizing Your Machine. The Data Model

mackenzie
Download Presentation

Database Design for Success

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Design for Success Informix 101

  2. #include <disclaimer.h>

  3. Agenda • Sizing your machine • Install and Configure the Engine • Build a database • Loading Data • Monitoring, Tuning & Ongoing Administration • Odds, Ends and other Pet Peeves

  4. Sizing Your Machine • The Data Model • The holy trinity • Disk, CPU, Memory • Raid, Mirroring, Striping

  5. No plan of battle survives it’s first contact with the enemy. - Clausewitz

  6. No System Design survives it’s first contact with the data. - Parker That’s no excuse for not making one.

  7. The Logical Model (LDM) • An informal representation of data and it’s relationships. • Format of the model can be changed to fit the users understandings. • Used to communicate with users. • Is NOT suited to be the foundation of a database. • Entities, attributes

  8. The Physical Data Model (PDM) • A formal description of a database and the links between tables. • Very rigid format. • Points out mistakes. • Can provide a sizing estimate. • Table, Columns. • A roadmap for new team members

  9. Input: table, rowsize, nrows A Sizing Tool Outputs: extent size, estimated scan time

  10. Sizing is not always possible • Have to order the machine before we know the full extent of the data. • Multiply the raw data size * 4 = disk • Divide by 25GB = CPU • round up to multiples of 4 • Multiply * 500MB = Memory • Make sure you can grow from there • Make sure to indicate this is an estimate.

  11. To Raid or not to Raid(a battle you will lose) • Expensive in terms of performance. Your machine is doing extra work – depending on the Raid you choose. • Mirroring is best, but expensive in terms of $. • 500 disks, 50,000 MTBF = 1 failure every 100 hours.

  12. Striping • Takes control out of the hands of the DBA and gives it to the sysadmin – who doesn’t have a clue. • Hard to say no to. • Mitigate it by striping small disk sets (groups of 4 or 8). • Worse comes to worse, benchmark.

  13. You now have • A machine. • A bunch of raw disks that have been sliced up. • Some expectation of table and index size. • A Physical Data Model expressed in terms of a schema.

  14. Setup • Getting the engine online • Layout of disks • root • logical logs • physical logs • data • temp

  15. Install Informix • Use $INFORMIXDIR as a link, not a hard coded path. • Run as root • Everybody knows this, right?

  16. Disks • mkdir /opt/server_1/dev (or somesuch) • ln –s all disks to this directory • Give them names that match underlying disks. • Be VERY careful not to overlay one link on top of another. (keep a map) Recovery note: • use a script to do the links. Save it. • $INFORMIXDIR/server_1/rebuild

  17. Raw vs Cooked Disk • Cooked disk is a file • Uses Unix file system I/O • Risk • Raw disk is a character special device (/dev/rdsk/c0d3s2) • Uses Kernel I/O (KIO) or Asynchronous I/O (AIO) • Engine does the work • 15% faster

  18. Disk Example • Raw Device • /dev/rdsk/c1d0s3 • Link to raw device • /opt/server_1/dev/c1d0s3 -> /dev/rdsk/c1d0s3 • onstat –d • dbsp1.1 /opt/server_1/dev/c1d0s3 • dbsp1.2 /opt/server_1/dev/c2d0s3 • dbsp1.3 /opt/server_1/dev/c3d0s3

  19. oninit -ivy • Cheat and use onmonitor • $ONCONFIG • $INFORMIXDIR/etc/sqlhosts • /etc/services • Don’t forget: $INFORMIXDIR, $INFORMIXSERVER, $TERM, $ONCONFIG, ($SQLHOSTS)

  20. The engine trinity • $ONCONFIG • DBSERVERNAME, ALIAS • SQLHOSTS • SERVERNAME, SERVICE • /etc/services • Service • $INFORMIXSERVER

  21. Move logs • Logical and Physical logs can be high activity. Good practice to move them to their own disks. • Physical • Change name of PHYSDBS in $ONCONFIG • Logical • Add new logs to new DBS • Force to next log (onmode –l) • Drop original logs (onparams –d –l log_id)

  22. Temp Disks • More is better • 2 or 3 per cpu • Many shallow is better than few deep. • Used to sort, build indices, overflows. • Size to largest index, or largest hash join.

  23. What is Parallelism? Sort vertical parallelism horizontal parallelism Join Time to Process Scan ParallelTasks Parallel Subtasks Tasks

  24. How do you turn it on? • PDQPRIORITY • Fragment your data

  25. PDQPRIORITY • Turns on parallelism (=1) • Allocates Decision Support (DS) Memory • $PDQPRIORITY • SET PDQPRIORITY • Memory = DS_Memory * (MAX_PDQPRIORITY/100) * (PDQPRIORITY/100) • 1GB * 50 * 50 /10000 = 250MB

  26. Fragment tables

  27. Fragmentation • Create table profile (……..) • Fragment by • round robin in dbslc1 • expression (col1 > abc in dbslc1.1, col1 < ghi and col1 > def in dbslc 1.2 • hash(col1) in dbslc1 • hybrid (col1) expression col2 > abc in dbslc1.1, col2 < ghi and col2 > def in dbslc1.2

  28. Fragment by hash at work

  29. Data Disks • Wide is better than deep • Break into logical groupings of dbspaces • Multiples of CPU’s – to a max of 3. • DO NOT name according to tables. • Wide.1…Wide.32 • Thin.1 … Thin.4 • Temp.1 … Temp.16 • (XPS DBslice)

  30. Table layout • Size first and next extents properly • Spread across multiple disks • Across multiple controllers • Balance Fragments to CPUs

  31. Balance • Horizontally • Even distributions of data across disks • Disks matched to CPUs (data and temp) • Vertically • Spread the load of checkpoints across Page cleaners and checkpoint writes

  32. Loading data • Check out the loader.FAQ www.artentech.com/downloads.htm • Load • DBload • Insert Cursor • HPL • PLoader

  33. Update Statistics • What does it do? • systables, sysindexes, • sysdistrib • Why do it? • How to make it faster • DBUPSPACE • Run in parallel • with LOCK MODE WAIT • Stored Procedures • Scripts at www.iiug.org

  34. Building Indices • Why build an index? • Find specific rows • Resolve query with index only scans • Promote nested loop joins • Bitmaps and Generalized Key indices • Fill Factor, Clustered • Involves: • Scan, Sort, btree-appender threads • Write to disk

  35. Attached, Detached, Fragmented • Detached • specify storage for the index • Otherwise uses the tablespace • Default in 7 is attached • Default in 9 is detached, 8 is detached • Detached means another 4 bytes • Why detach?

  36. Monitoring, Tuning and Ongoing Administration

  37. Things to tune • CPU • Memory • Disk • Network / Interconnect • Data Model • Processing

  38. Things to tune The Engine 2x - 3x

  39. Things to tune Disk 2x - 100x The Engine 2x - 3x

  40. Things to tune Code 10x – 1000x Disk Engine

  41. Things to tune Data Model Code Disk Engine 10x – 10000x

  42. Tuning • Big knobs • Disk • Disk • Disk • Memory • Little knobs • The rest of $ONCONFIG

  43. MonitorngSo many utilities… • onstat • sysmaster • OS utilities (iostat,mpstat) • Graphical (onperf, xtree) • Specific (ISpy) • Third Party (Patrol, Big Brother, etc) • Query Plan • Home grown

  44. … It all boils down to • What do YOU care about. • Look for problems, and monitor events leading up to those.

  45. Some things I watch • Disk fragmentation, size, up • onstat –p • Did expected event take place? • Long checkpoints • Logfiles • errors in online.log • Resource hogs

  46. onstat –pdipstick & thermometer rolled into one • Cache, waits, compresses, rollbacks • Negative values • Capture them periodically and reset • if you care that much

  47. Disk • Are table fragmented/sized properly • full scan of table – how long? • fragment elimination • < 4 extents per dbspace • Has someone dropped a new table somewhere? • Good indices? • Look at query plans

  48. Different disk utilities report different things • Raw Data Size • From Unix 32,000,000,000 chars • Pages allocated • From oncheck –pe 10,000,000 pages • Created table as • First Size 100,000 kbytes • Pages used • From npused 7,928,642 pages

  49. onchk.awk Chunk: rootdbs  Size: 19.53  Used: 13.54  Ckfree:  5.99 PctFree: 30.69sysutils            0.62    Extents  39 *****LOGICAL           9.77    Extents  20 *****stores7             0.67    Extents  40 *****PHYSICAL       0.98    Extents   1sysmaster        1.17    Extents  73 ***** Chunk: testdbs  Size: 9.77  Used:  0.12    Ckfree:  9.65 PctFree: 98.78                                                *** OVER 80% FREE SPACE ***stores7                    0.02    Extents   1 Grand total disk use (MB)     27.19 See Handout

  50. Memory • Is there enough? • What does overflow look like? • Is there enough temp space to cover it? • How are people joining tables?

More Related