1 / 27

VSAM

VSAM. Virtual Storage Access Method Allows for interactive updates (adds, changes, and deletes). Three Types of VSAM files. KSDS Keyed Sequence Data Set RRDS Relative Record Data Set ESDS Entry Sequence Data Set. KSDS. Allows for users to access records sequentially or randomly

silas
Download Presentation

VSAM

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. VSAM • Virtual Storage Access Method • Allows for interactive updates (adds, changes, and deletes)

  2. Three Types of VSAM files • KSDS • Keyed Sequence Data Set • RRDS • Relative Record Data Set • ESDS • Entry Sequence Data Set

  3. KSDS • Allows for users to access records sequentially or randomly • Includes an index and data components • A CLUSTER consists of both the index and data components together • Index relates a value in a key field to the actual location of the record on disk • Index is only used in random (or dynamic) processing

  4. VSAM Data Component Storage Concepts • Control Interval (CI) • Fixed amount of storage; Must be multiple of 512 • Usually 2048 or 4096 bytes • Data records are stored in a CI • A CI of 4096 bytes can store forty 100-byte records • Control Areas (CA) • One cylinder in size • Size of a cylinder varies by disk drive • Contains numerous CIs

  5. Approximate number of records in a CI (CI size - 10) / record length

  6. Records in a CI • Calculate the number of records in a CI formula on previous slide • Determine the percentage of records in a CI to be added or insert in an average CI • Concept of FreeSpace • Space where records can be added “on the fly”

  7. CI in a CA • A fixed number based on CI size and disk drive • For disk drive with ACD001: • CI of 2048: 315 CI in a CA • CI of 4096: 180 CI in a CA • Determine the percent of the CI to be left open for additions • Known as CA freespace

  8. Freespace in a CI • Is used to add records which belong in the CI • Records in a CI are shifted automatically within the CI to accommodate the inserted record

  9. CI Splits • If there is no freespace in the CI in which the record is to be inserted: • CI split • Half of the records in the CI are moved to a free CI in the same CA • The inserted record is then inserted in the proper CI • These happen routinely and are accomplished quickly

  10. CA splits • If a CI needs to split and there is no free CI in the CA: CA split • Half of the CI are moved to a free CA (usually at the end of the file, there is unused space) • Therefore, each of the two CI have 50% free space • The original CI can now split • CA splits have much overhead and should be avoided!!!

  11. Avoid CA splits • Reorganization of files • Import: Copy the VSAM file to a sequential dataset • Export: Delete and reload the VSAM file from the sequential data set: Resets the freespace as in the define cluster • Allocate adequate freespace • Analyze the primary key

  12. Primary Key Analysis • Is the pattern in the PK • Example: The Financial Aid office must keep three years of data on-line. • Previous year: State reporting • Current year: Distributing aid to students • Next year: Granting/guaranteeing aid for next year

  13. Primary Key Analysis • Key options for the financial aid file • 1-digit year + SSN • Little on-line activity on first third of file • Most “adds” are in last third: CI/CS split likely • SSN + 1 digit year • Activity is spread evenly throughout the file • Recommended

  14. VSAM Index Component • Every KSDS as an index component for each primary key AND each foreign key • Base cluster: Primary key index and Data components • Foreign key index is known as the alternate index

  15. Primary Key Index and Data • Two parts of the index: • Sequence set • Index set

  16. Primary Key Index and Data • Sequence Set • Lowest level of the index component • Contains information that relates key values to a specific CI • Links the highest PK in a CI to the address of that CI • Stores all the key-address pairs for the Cis in a CA in one CI of the sequence set • There is a separate CI in the sequence set for each CA in the data

  17. Primary Key Index and Data • Index Set • Highest level of the index component • Key-address pairs stored in one CI (can be stored in main memory for processing efficiency) • Address Pointer links to address of the appropriate sequence set • Based on size of data component (number of cylinders or CA needed to store data component), you may need a intermediate index

  18. Alternate Index and Data Component • Alternate index relates alternate key to primary key • Then uses the primary key index to locate the data • Can have unique or non-unique AI • Figure 2-4 (unique) • Figure 2-5 (non-unique)

  19. Alternate Index • Systems analyst determines whether to update the AI each time the cluster is changed • Can cause much overhead to update all indices especially in the case of a CA split • Other alternative is to re-build the index periodically (every night)

  20. Relative Record Data Set (RRDS) • Lets the user access each record at random without the overhead of maintaining an index • Instead each record in a RRDS is numbered, starting with 1 for the first record • RRDS consists of a specified number of areas or slots • Known as the relative record number (RRN)

  21. RRDS • May need a routine to convert the PK of a record to a relative record number • Hashing • Most common hashing routine: The remainder option on the divide • Can cause empty slots. Can waste storage; But we avoid CI/CA splits • Difficult to HASH if PK is non-numeric

  22. RRDS • Collision • If hashing routine results is same RRN for two different PK • Must set secondary searching technique in case of collisions • Usually linear probing: check the next record up to a maximum number of tries. • Needed to know if record to be added already exists • Needed to determine if record to be retrieved exists without reading entire file

  23. RRDS • Advantages • No index overhead • Direct relationship between data and location of the data • Permits both random and sequential processing • If good hashing routine with minimal collisions: performance efficiency is excellent

  24. RRDS • Disadvantages • Storage efficiency • Collisions • Difficulty in determining good hashing technique • Difficulty with alphabetic key • Does NOT support the concept of FK or AI • Not widely used

  25. ESDS • Entry Sequenced Data Set • The simplest type of VSAM file • Records are stored sequentially at time of entry

  26. ESDS • Similar to sequential processing • Does allow to OPEN EXTEND to add records to the end of the file

  27. ESDS • Author does not recommend it • Says it is restricted to sequential processing • BUT you can build an AI for a FK • BUT as of my current COBOL manuals, COBOL cannot use the AI and processing must be sequential. • This is my current topic for career day questions

More Related