Chapter 7: Working with data sets

Chapter 7: Working with data sets jkettner@us.ibm.com

Chapter 7 objectives • Be able to: • Explain what a data set is • Describe data set naming conventions and record formats • List some access methods for managing data and programs • Explain what catalogs and VTOCs are used for • Be able to create, delete, and modify data sets

block size catalog data set high level qualifier (HLQ) library logical record length (LRECL) member PDS and PDSE record format (RECFM) system managed storage (SMS) virtual storage access method (VSAM) VTOC Key terms in this chapter

Storing Datasets

DASD: Use and terminology • Direct Access Storage Device (DASD) is another name for a disk drive. • DASD volumes are used for storing data and executable programs. • Data sets in a z/OS system are organized on DASD volumes. • A disk drive contains cylinders • Cylinders contain tracks • Tracks contain data records.

What is a data set? • A data set is a collection of logically related data records stored on one disk storage volume or a set of volumes (and TAPE). • A data set can be: • a source program • a library of macros • a file of data records used by a processing program. • You can print a data set or display it on a terminal. The logical record is the basic unit of information used by a program running on z/OS.

How data sets are named • Data set naming convention • Unique name • Maximum 44 characters • Maximum of 22 name segments: level qualifier • The first name in the left: high level qualifier (HLQ) • The last name in the right: low level qualifier (LLQ) • Level qualifiers are separated by '.' • Each level qualifier: • From 1 up to 8 characters • The first must be alphabetical (A-Z) or special (@ # $) • The 7 remaining: alphabetical, national, numeric (0-9) or hyphen (-) • Upper case only • Example: MYID.JCL.FILE2 HLQ: MYID 3 qualifiers • Member name of partitioned data set • 8 bytes long • First byte: alphabetical (A-Z) or special (@ # $) • The 7 remaining: alphabetical, special, numeric (0-9)

Dataset Naming

HFS* * Includes zFS Access Method – What is it? • Defines the technique used to store and retrieve data. • Includes system-provided programs and utilities to define and process data sets. • Commonly used access methods include the following: BDAM BSAM ISAM BPAM VSAM QSAM OAM DIV

Using a data set • To use a data set, you first allocate it. Then, access the data using macros for the access method that you have chosen. • Various ways to allocate a data set: • ISPF data set panel, option 3.2 • Access Method Services • TSO ALLOCATE command • job control language (JCL)

Allocating space on DASD volumes • How space is specified: • explicitly (SPACE parameter) • implicitly (SMS data class) • Logical records and blocks: • Smallest amount of data to be processed • Grouped in physical records named blocks • Data set extents: • Space for a disk data set is assigned in extents

Data Set: how is it stored? iebcopy

F record record record record Fixed records. block block FB record record record record record record Fixed blocked records. BLKSIZE = n * LRECL V record record record Variable records. RDW block block VB record record record record record Variable blocked records. BLKSIZE >= 4 + n * largest LRECL BDW U record record record record Undefined records. No defined internal structure for access method. Record and block descriptors words are each 4 bytes long Data set record formats

Types of data sets • We discuss three types in this class: • Sequential, partitioned, and VSAM • A sequential data set is a collection of records written and read in sequential order from beginning to end. • A partitioned data set (PDS) is a collection of sequential data sets, called members. • Consists of a directory and one or more members. • Also called a library. • A PDSE is a partitioned data set extended.

You create a directory when you create PDS dataset Means of grouping similar Artifacts (code or data) Mixing of dataset types within a volume Different business needs not necessarily related Sequential and Partitioned Datasets One dataset Multiple datasets ani

= PDSE Partitioned Datasets

/ ` PDS versus PDSE • PDS data sets: • Simple and efficient way to organize related groups of sequential files. • PDSE data sets: • Similar to a PDS, but advantages include: • Space reclaimed automatically when a member is deleted • Flexible size • Can be shared • Faster directory searches X DISK Only

Advantages of PDSE • The size of a PDSE directory is flexible and can expand to accommodate the number of members stored in it • (the size of a PDS directory is fixed at allocation time). • * PDSE members are indexed in the directory by member name. This eliminates the need for time-consuming sequential directory searches. • * The logical requirements of the data stored in a PDSE are separated from the physical (storage) requirements of that data, which simplifies data set allocation. • * PDSE automatically reuses space, without needing an IEBCOPY compress. A list of available space is kept in the directory. When a PDSE member is updated or replaced, it is written in the first available space. This is either at the end of the data set, or in a space in the middle of the data set marked for reuse. • For example, by moving or deleting a PDSE member, you free space that is immediately available for the allocation of a new member. This makes PDSEs less susceptible to space-related abends than PDSs. • * This space needs not be contiguous. The objective of the space reuse algorithm is not to extend the data set unnecessarily. • * The number of PDSE members stored in the library can be large or small without concern for performance or space considerations. • * You can open a PDSE member for output or update, without locking the entire data set. • The sharing control is at member level, not the data set level. • * The ability to update a member in place is possible with PDSs and PDSEs. But with PDSEs, you can extend the size of members and the integrity of the library is maintained while simultaneous changes are made to separate members within the library. • * The maximum number of extents of a PDSE is 123; the PDS is limited to 16. • * PDSEs are device-independent because they do not contain information that depends on location or device geometry. • * PDSEs can contain program objects built by the program management binder thatcannot be stored in PDSs.

Generation Dataset (GDG) DSN=MYFILE.ACCOUNT.FILE(+1) DSN=MYFILE.ACCOUNT.FILE(G00V0001) • All of the datasets in the group can be referred to by a common name • The Operating System is able to keep the generations in chronological order

IDCAMS Utility - LISTCAT output of GDG base MHLRES4.TEST.GDG(+1) … MHLRES4.TEST.GDG(+2) during job run MHLRES4.TEST.GDG(0) …. MHLRES4.TEST.GDG(-1) at job conclusion

GDG illustration

JCL Attributes can also be entered thru JCL //DDCARD DD DSN=ROGERS.JCL.TEST,DISP=(,CATLG), // SPACE=(CYL,(5,1,50),,CONTIG), // DCB=(RECFM=FB,LRECL=80,BLKSIZE=27920) Note: Positional parameters Allocating a Dataset in ISPF NOTE: DFHSMS-ACS

Catalogs and VTOCs • z/OS uses a catalog and a volume table of contents (VTOC) on each DASD volume to manage the storage and placement of data sets. • VTOC: • Lists the data sets on a volume • Lists the free space on the volume.

Format 7 DSCB Format 1 DSCB VTOC Cyl. 0 Trk. 0 DSCB (Data Space Control Block) ICKDSF: z/OS utility for creating label and VTOC

Dataset Control Blocks (DSCB)

IBM Utility – IEHLIST VTOC

VTOC Index Structure ISPF option 3.4

First record in every VTOC Volume Table of Contents 140 byte DSCBs  Format type 4

VSAM Volume Data Set

How a catalog is used • A catalog associates a data set with the volume on which the data set is located. • Locating a data set requires: • Data set name • Volume name • Unit (volume device type) • Typical z/OS system includes a master catalog and numerous user catalogs.

Catalog Structure Basic Catalog Structure (BCS) – This is considered the “real” Catalog The BCS is a VSAM KSDS and its primary function is to point to the volumes on which the dataset is located. VSAM Volume Dataset (VVDS) – Can be considered an ‘extension’ of VTOC The VVDS is an ESDS containing information to process the dataset containing volume related information. VTOC

Volume, Security, Ownership, etc. VSAM and non-SMS Managed Datasets SYS1.VVDS.Vvolser Integrated Catalog Structure (ICF) Many to Many Basic Catalog Structure (BCS) – Static information that rarely changes VSAM Volume DataSet (VVDS) - Additional Catalog Information

Where DS resides: Tape, Disk,…other Uses Dataset Names as keys BCS – itself is a VSAM KSDS dataset

IDCAMS - VVDS example • VSAM Utilities ------------------------------------- LINE 00000000 COL 001 080 • COMMAND ===> SCROLL ===> PAGE • ********************************* Top of Data ********************************** • IDCAMS SYSTEM SERVICES TIME: 07:47:41 • /* IDCAMS COMMAND */ • LISTCAT ENTRIES(SYS1.VVDS.VDMPU03) • CLUSTER ------- SYS1.VVDS.VDMPU03 • IN-CAT --- CATALOG.MASTER.MCAT • DATA ------- SYS1.VVDS.VDMPU03 • IN-CAT --- CATALOG.MASTER.MCAT • IDCAMS SYSTEM SERVICES TIME: 07:47:41 • THE NUMBER OF ENTRIES PROCESSED WAS: • AIX -------------------0 • ALIAS -----------------0 • CLUSTER ---------------1 • DATA ------------------1 • GDG -------------------0 • INDEX -----------------0 • NONVSAM ---------------0 • PAGESPACE -------------0 • PATH ------------------0 • SPACE -----------------0 • USERCATALOG -----------0 • TAPELIBRARY -----------0 • TAPEVOLUME ------------0 • TOTAL -----------------2 • THE NUMBER OF PROTECTED ENTRIES SUPPRESSED WAS 0 • IDC0001I FUNCTION COMPLETED, HIGHEST CONDITION CODE WAS 0 } ESDS: VSAM Seq file

Catalog Path Structure DEFINE ALIAS (NAME (IBMUSER) RELATE (USERCAT.IBM ) )

User Catalog Alias’ Good practice: different applications in different UCATs

DS Locating a dataset in MVS Oh…its Catalogued nevermind Where is that #$%@ Dataset GREAT !

Catalog and Uncataloged Datasets Note the ‘ // ‘ and parm statements used for Job Control Language

Tape datasets are limited to 255 volumes Max Dataset Size • Some types of datasets are limited to • 65,535 total tracks allocated on any • one volume • Exceptions: • H F S (165.2 GB) • P D S E (123 Extents) • VSAM(4GB limit)* Special cases – Extended format data sets with multiple strips are limited to 16 volumes * Using Extendable Format = 128 TB

Large Volume (own device type)

z/OS Access Method VSAM – not just a data set type • VSAM is Virtual Storage Access Method • VSAM provides more complex functions than other disk access methods • VSAM record formats: • Key Sequence Data Set (KSDS) • Entry Sequence Data Set (ESDS) • Relative Record Data Set (RRDS) • Linear Data Set (LDS)

MVS Datasets versus Unix Files

Data management in z/OS • Data management involves all of the following tasks: • allocation, placement, monitoring, migration, backup, recall, recovery, and deletion. • Storage management is done either manually or through automated processes (or through a combination or both). • In z/OS, DFSMS is used to automate storage management for data sets.

Data Facility Subsystem Managed Storage (DFSMS) Rules based

Gold customer High Priority Business Partner High Priority Customer care High Priority (Business Hours) Casual customer Low priorty Data Analysis (Best can do) DFSMS (System Managed Storage) Policy Management

Storage Management Data Used to control: retention migration backup release Used to control: performance goals availability Used to control: Allocation Space Automatic Class Selection

Gold customer High Priority Business Partner High Priority Rule #1 Tape Tape Tape Tape Rule #4 Secondary Storage Secondary Storage Third Storage Media Data Analysis (Best can do) Secondary Storage Rule Based Policy Management to Manage Backup/Restore Automatically via Hierarchical Storage CritSit ! Rule #2

Interactive Storage Management Facility (ISMF) Storage Management is performed interactively via ISPF panels

z/OS Access Method VSAM – not just a data set type • VSAM is Virtual Storage Access Method • VSAM provides more complex functions than other disk access methods • VSAM record formats: • Key Sequence Data Set (KSDS) • Entry Sequence Data Set (ESDS) • Relative Record Data Set (RRDS) • Linear Data Set (LDS)

VSAM Access Method

Chapter 7: Working with data sets