
Jack Parker, Arten Technology Group


Presentation Transcript


  1. Jack Parker, Arten Technology Group

  2. Calisthenics • How many here are Informix users? • How many here are HPL users?

  3. Why use HPL • It’s Fast • It’s Flexible

  4. Previous (and still extant) Loaders • LOAD: 1 GB/Hr. 2 GB limit. Long-transaction heaven. • dbload: 1 GB/Hr. 2 GB limit. • dbimport: 1 GB/Hr. 2 GB limit. Entire database. • Insert Cursor: 2 GB/Hr. 2 GB limit. Complex. • High Performance Loader: Faster. More flexible. • See Load FAQ at www.artentech.com/downloads.htm

  5. Basic Load Rate (1 CPU)

  6. Multiprocessor Load Rate

  7. Why is it faster? • Other loaders: single-threaded; use buffers (the Wall) • HPL: multi-threaded; avoids buffers; Light Scans / Light Appends

  8. Light Scans and Light Appends • Light Scans use their own buffer pool and avoid the normal buffers • Light Appends write to their own disk and append the results if successful • Light scans occur when: • Table is bigger than buffers • Dirty Read • No Varchars

  9. In reality… • HPL performance depends not only on CPU, but also on disk I/O • The more disk you can put into the equation, the faster it will run. • My tests were run on: a 2-CPU Solaris machine; 1 disk; 31MB of RAM for Informix

  10. More disk • Spread input/output files across filesystems (create three or four large data areas just for this). • Fragment tables across multiple dbspaces. (up to 3 per CPU, but at least 100MB/dbspace) • Size tables properly
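The fragmentation rule of thumb on this slide can be turned into a small calculation. The following is only a sketch of one reading of that rule (at most 3 dbspaces per CPU, and no dbspace smaller than 100 MB); the function name and parameters are hypothetical and are not part of any Informix tool.

```python
def suggest_fragment_count(cpus, table_mb, per_cpu_cap=3, min_mb_per_dbspace=100):
    """Suggest how many dbspaces to fragment a table across.

    Assumed reading of the slide: use up to `per_cpu_cap` dbspaces per CPU,
    but never so many that a dbspace would hold less than
    `min_mb_per_dbspace` MB of the table.
    """
    by_cpu = cpus * per_cpu_cap                # upper bound from CPU count
    by_size = table_mb // min_mb_per_dbspace   # upper bound from table size
    return max(1, min(by_cpu, by_size))        # always at least one dbspace


# A 5 GB table on 2 CPUs is limited by the CPU rule; a 250 MB table by size.
large = suggest_fragment_count(cpus=2, table_mb=5000)
small = suggest_fragment_count(cpus=2, table_mb=250)
```

Treat the result as a starting point only: the deck itself notes that disk I/O, not just CPU, drives HPL performance.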

  11. Unload Metrics • Simple unload 2.4 GB/CPU/Hr • Wait a minute • Update statistics • Simple unload 4.9 GB/CPU/Hr

  12. Flexible • Load or Unload • Disk • Tape • Pipe

  13. This means that you can… • Load straight from (or to) disk • Prepare data to a pipe and Load • Unload to gzip (pipe) • Load from gzip (pipe) • Unload to another HPL load process • Or Pload • Even on another machine
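The "unload to gzip (pipe)" idea relies on a named pipe: one process writes rows into the pipe while another reads and compresses them, so the uncompressed data never lands on disk. The sketch below illustrates only that plumbing on a Unix-like system. The `fake_unload` function is a hypothetical stand-in for an HPL unload job whose device is the pipe; the file names and the pipe-delimited row format are also assumptions.

```python
import gzip
import os
import tempfile
import threading


def compress_from_pipe(fifo_path, gz_path):
    """Read everything arriving on the named pipe and write it gzip-compressed."""
    with open(fifo_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        for chunk in iter(lambda: src.read(64 * 1024), b""):
            dst.write(chunk)


def fake_unload(fifo_path, rows):
    """Hypothetical stand-in for the unload job: writes delimited rows to the pipe."""
    with open(fifo_path, "wb") as out:
        for row in rows:
            out.write(("|".join(row) + "|\n").encode())


workdir = tempfile.mkdtemp()
fifo = os.path.join(workdir, "unload.pipe")
gz = os.path.join(workdir, "table1.unl.gz")
os.mkfifo(fifo)  # Unix only

rows = [("1", "alpha"), ("2", "beta")]
reader = threading.Thread(target=compress_from_pipe, args=(fifo, gz))
reader.start()            # reader blocks until a writer opens the pipe
fake_unload(fifo, rows)   # writer side; closing the pipe signals end of data
reader.join()

# Read the compressed file back to confirm the round trip.
with gzip.open(gz, "rb") as f:
    restored = f.read().decode()
```

Loading from gzip is the mirror image: a decompressing process writes into the pipe and the load job reads from it.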

  14. A quick look at some of those: • Unload to gzip: 6 GB/CPU/Hr • Load from gzip: 3 GB/CPU/Hr • Internal format to gzip: 11.5 GB/CPU/Hr • Internal format from gzip: 7.5 GB/CPU/Hr • HPL “Insert into table2 select * from table1”: 4.5 GB/CPU/Hr • (vs. normal insert/select): 0.7 GB/CPU/Hr
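To put these rates in perspective, the sketch below converts them into elapsed time for a given data volume. It pairs the slide's six labels with its six rates in the order they appear, and it assumes throughput scales linearly with CPU count, which is an idealization. The dictionary keys and the function name are made up for illustration.

```python
# Rates in GB per CPU per hour, read in order from the slide.
RATES_GB_PER_CPU_HR = {
    "unload_to_gzip": 6.0,
    "load_from_gzip": 3.0,
    "internal_to_gzip": 11.5,
    "internal_from_gzip": 7.5,
    "hpl_insert_select": 4.5,
    "normal_insert_select": 0.7,
}


def hours_to_move(size_gb, rate_gb_per_cpu_hr, cpus=1):
    """Estimated hours to move size_gb, assuming linear scaling across CPUs."""
    return size_gb / (rate_gb_per_cpu_hr * cpus)


# Moving 45 GB table-to-table on one CPU: HPL vs. a normal insert/select.
hpl_hours = hours_to_move(45, RATES_GB_PER_CPU_HR["hpl_insert_select"])
sql_hours = hours_to_move(45, RATES_GB_PER_CPU_HR["normal_insert_select"])
speedup = (RATES_GB_PER_CPU_HR["hpl_insert_select"]
           / RATES_GB_PER_CPU_HR["normal_insert_select"])
```

On these figures the HPL route is a bit more than six times faster than the plain insert/select, roughly 10 hours against 64 for the 45 GB example.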

  15. You can use HPL for all sorts of things it wasn’t intended for • As a backup tool • As a reorg tool • As a data movement/migration tool • As an ETL tool • For more thoughts on this check out Raj Murali’s talk on ETL within the database (XPS) D31 • also see the IBM Developer zone DSS Application Processing http://www7b.boulder.ibm.com/dmdd/zones/informix/library/techarticle/parker/0502parker.html

  16. Flexible on the inside too. • A “Natural” interface • ipload to start • $DISPLAY • xhost • INFORMIXSERVER=tcp connection (for some)

  17. Project Contents

  18. Load Main Screen

  19. Device Specification

  20. Format Specification

  21. Filter Specification

  22. MAP

  23. Options

  24. Unload Main Screen

  25. Query Specification

  26. Unload Device

  27. Generate Job

  28. Other • $DBONPLOAD • $PLCONFIG • Create your own functions and link them in • onpladm

  29. Summary The High Performance Loader is a powerful tool which can be used wherever you need to move a lot of data around efficiently. With this tool you can replace a lot of old and slow code and realize significant performance improvements. Best of all? It’s already on your machine.
