
To Compress or not to Compress?






Presentation Transcript


  1. To Compress or not to Compress? Chuck Hopf

  2. What is your precious? • Gollum says every data center has something that is precious or hard to come by • CPU Time • DASD Space • Run Time • IO • Memory

  3. Lots of talk • On the LISTSERV – does compression use more CPU? Does it save DASD space? • On the LISTSERV – what is the best BUFNO= to use with MXG?

  4. Testing the theories • Built two tests • COMPRESS=NO, varying BUFNO across 2, 10, 15, and 20 • COMPRESS=YES, again varying BUFNO across the same values

  5. An Epiphany! • What if you run with COMPRESS=NO, send the output to PDB as a temporary dataset, and then, at the end, turn on COMPRESS=YES and do a PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE; ? That would eliminate all of the compression overhead during the reading and writing of the interim datasets but still create a compressed PDB.
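A minimal sketch of that two-step approach (z/OS SAS; the PDB and PERMPDB DDNAMEs are from the slide, while the surrounding BUILDPDB step is elided and the details are illustrative, not the exact production job):

```sas
/* Sketch only: build the PDB uncompressed, then copy it to a
   compressed permanent library at the end of the job.         */
OPTIONS COMPRESS=NO;            /* interim datasets stay uncompressed */
LIBNAME PERMPDB COMPRESS=YES;   /* final library is compressed        */

/* ... BUILDPDB (or %UTILBLDP) runs here, writing to PDB ... */

PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE;  /* recompress on the copy */
RUN;
```

NOCLONE tells PROC COPY not to clone the input datasets' attributes, so the output library's COMPRESS=YES setting takes effect during the copy.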

  6. So there are now 3 Tests! • TEST=NO - COMPRESS=NO • TEST=NO/YES - COMPRESS=NO but final PDB is compressed • TEST=YES – COMPRESS=YES

  7. CPU Time

  8. Elapsed Time

  9. Low Memory

  10. High Memory

  11. EXCP DASD

  12. DASD IO Time

  13. DASD Space

  14. DASD Space by DDNAME

  15. Conclusions? • Running with COMPRESS=NO and then copying to a compressed PDB optimizes permanent DASD space and uses very little additional CPU. • Even better, use the LIBNAME option to turn compression on only where you want it: • LIBNAME PDB COMPRESS=YES; /* zOS only */ • Memory requirements increase with BUFNO but are not really that bad, and BUFNO GT 10 shows very little additional benefit

  16. Caveats! • BLKSIZE matters. SAS procs are sometimes built with a BLKSIZE of 6160 on WORK, which radically affects the IO counts. Use the recommended BLKSIZE(DASD)=OPT and leave the DCB attributes off of SAS datasets. • REGION may have to be increased – use REGION=0M and be sure you are using the MXG defaults for MEMSIZE. • All of this applies to zOS, not to ASCII platforms
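The blocking caveat above can be set as a SAS system option. The spelling below follows the z/OS SAS companion's BLKSIZE(device-type)= form and should be verified against your release:

```sas
/* Hedged sketch: request optimal (half-track) blocking for SAS
   libraries on DASD instead of letting a small default BLKSIZE
   such as 6160 inflate the IO counts.                          */
OPTIONS BLKSIZE(DASD)=OPT;  /* SAS chooses the optimal block size */
```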

  17. So What About ASCII? • Using the same data, tests were run with SAS 9.2 on a Windows 7 system • 1.5GB memory • Dell 4600 – P4 2.7GHz

  18. ASCII Results

  19. Wow! • COMPRESS=YES outperforms COMPRESS=NO! • BUFNO makes some difference, but not a lot, and BUFNO=10 looks to be optimal • The difference is in seconds, not minutes • But… there is something we don’t understand in the memory numbers • Runs faster under Win 7 than under zOS • But that does not include download time

  20. So What Should You Do? • It depends on what your ‘precious’ is • Running zOS: • Optimal for both CPU and DASD is COMPRESS=NO with a copy to a compressed dataset at the end, or with the COMPRESS=YES option on a LIBNAME • Optimal for CPU alone is COMPRESS=NO • Optimal for DASD alone is COMPRESS=YES • BUFNO=10 is optimal for run time • Running ASCII: • Optimal for both CPU and DASD is COMPRESS=YES

  21. JCL

  //* SAMPLE JCL TO RUN BUILDPDB WITH COMPRESS=NO AND COMPRESS AT
  //* THE END USING PROC COPY
  //S1       EXEC MXGSASV9
  //PDB      DD DSN=MXG.PDB(+1),SPACE=(CYL,(500,500)),
  //            DISP=(,CATLG,DELETE)
  //SPININ   DD DSN=MXG.SPIN(0),SPACE=(CYL,(500,500)),
  //            DISP=(,CATLG,DELETE)
  //SPIN     DD DSN=MXG.SPIN(+1),DISP=OLD
  //CICSTRAN DD DSN=MXG.CICSTRAN(+1),SPACE=(CYL,(500,500)),
  //            DISP=(,CATLG,DELETE)
  //DB2ACCT  DD DSN=MXG.DB2ACCT(+1),SPACE=(CYL,(500,500)),
  //            DISP=(,CATLG,DELETE)
  //SMF      DD DSN=YOUR.SMF.DATA,DISP=SHR
  //SYSIN    DD *
    OPTIONS COMPRESS=NO BUFNO=10;
    LIBNAME PDB COMPRESS=YES;
    LIBNAME SPIN COMPRESS=YES;
    %LET SPININ=SPININ;
    %UTILBLDP(
      MACKEEPX=
        MACRO _LDB2ACC DB2ACCT.DB2ACCT %
        MACRO _KDB2ACC COMPRESS=YES %
        MACRO _KCICTRN COMPRESS=YES %
      ,
      SPINCNT=7,
      SPINUOW=2,
      OUTFILE=INSTREAM);
    %INCLUDE INSTREAM;

  The JCL is in the 27.10 SOURCLIB as member JCLCMPDB.

  22. Why UTILBLDP? • Allows you to add data sources to BUILDPDB without having to edit the macros in the SOURCLIB. • Allows you to suppress data sources like 110 and DB2 and TYPE74 and process them in other jobs again without editing the macros. • Flexibility

  23. Example

    OPTIONS COMPRESS=NO BUFNO=10;
    LIBNAME PDB COMPRESS=YES;
    LIBNAME SPIN COMPRESS=YES;
    %LET SPININ=SPININ;
    %UTILBLDP(
      USERADD=42,
      SUPPRESS=110 DB2,
      SPINCNT=7,
      OUTFILE=INSTREAM);
    %INCLUDE INSTREAM;
    RUN;

  24. MXG User Experience • Running MXG with WPS instead of SAS • Data from multiple platforms • Processed under two virtualization products • Also, a comparison of SAS/PC and WPS on zLinux

  25. PC/SAS VMWARE/Windows versus PC/SAS Hyper-V/Windows
      (four platforms’ data, three installation “groups”: PROD/QA/DEV)

      Data From          VMWARE(PROD)   Hyper-V(PROD)
      Unix                   00:05:30        00:10:56
      zOS                    00:01:30        00:04:54
      zVM/Linux              00:03:07        00:08:08
      Windows Servers        02:43:08        09:32:57

      Data From          VMWARE(QA)     Hyper-V(QA)
      Unix                   00:00:31        00:04:18
      zOS                    00:01:27        00:02:46
      zVM/Linux              00:01:02        00:07:06
      Windows Servers        00:41:24        02:34:19

      Data From          VMWARE(DEV)    Hyper-V(DEV)
      Unix                   00:00:43        00:02:42
      zOS                    00:00:21        00:01:42
      zVM/Linux              00:01:08        00:03:34
      Windows Servers        00:09:06        00:38:47

      Processing of performance data collected from Unix, zVM/Linux, zOS, and Windows.

  26. PC/SAS versus LNX/WPS • PC/SAS VMWARE/Windows versus WPS zVM/Linux • PC/SAS under VMWARE takes 2:43:08 to process the “Windows Servers” data, which the WPS zVM/Linux environment can do in 1:30:00 (hh:mm:ss). • That is, the mainframe WPS zVM/Linux run is a 45% improvement over PC/SAS VMWARE/WIN ((2:43:08 − 1:30:00) / 2:43:08 ≈ 45%). • This is most likely due to the extra I/O bandwidth the mainframe has compared to the Windows environment. • The results for Windows would probably be better if WIN2008 had been used.

  27. PC/SAS versus WPS on z • PC/SAS under Hyper-V • WPS under zVM/Linux on z-10

  28. Z10: SAS versus WPS • zOS/SAS versus zOS/WPS to run MXG • 30% more I/Os for SAS • TCB for WPS = 551,423 • TCB for SAS = 551,273 • NOTES: • WPS version 2.4.0.1 and SAS 9.1.3 • MXG from FEB 2009
