Topics Covered:
• File organization
• Techniques of file organization
• Serial file organization
• Sequential file organization
• Direct file organization
• Indexed sequential file organization
File Organization:
• File organization refers to the relationship between the key of a record and the physical location of that record in the computer file.
• Two important characteristics of a file are its data organization and its method of access.
• Data organization refers to the way the records of the file are arranged on the backing storage device.
• Method of access refers to the way in which records are retrieved. Some organizations are more versatile than others: a file with Indexed or Relative organization may still have its records accessed sequentially, but records in a file with Sequential organization cannot be accessed directly.
Techniques of File Organization:
• Serial - Records stored one after another in no particular order.
• Sequential - Records stored one after another in ascending or descending order of key.
• Direct - Organization based on relative record number.
• Indexed - Organization based on an index.
Serial Organization
• Creation of file -> In a serial file the records are arranged one after another. There is no relationship between the key field values of consecutive records; in other words, records can be stored in any order.
• Access -> The only way to access records in a serial file is linear search: the key field value being sought is compared with the key field value of each record, starting at the first record and reading each succeeding record until the required record is found or the end of the file is reached.
• Insertion -> Since no space is left between records when the file is created, a new record can only be added at the end of the file. To insert a record between existing records, the file must be rewritten.
• Deletion -> To delete a record, it is first located and then marked for deletion.
• Media Used -> Serial organization may be implemented on magnetic tape or on hard disk, i.e. on a serial-access device as well as on a direct-access device.
• Advantages ->
  • Easy to use.
  • Maximum utilization of space.
  • Algorithms are easy to implement.
  • Low storage cost.
• Disadvantages ->
  • Slow, because only linear access is possible.
  • Deleting and updating records in a serial file is not really feasible.
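The serial-file operations described above (append at the end, linear search, mark for deletion) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "file" is an in-memory list, and the record layout (`key`, `data`, `deleted`) and the function names are hypothetical, not taken from the notes.

```python
from typing import Optional

# Hypothetical in-memory stand-in for a serial file: a list of records
# kept in arrival order. The field names are assumptions for this sketch.

def append_record(file: list, key: int, data: str) -> None:
    """Insert at the end of the file; no key ordering is maintained."""
    file.append({"key": key, "data": data, "deleted": False})

def linear_search(file: list, key: int) -> Optional[dict]:
    """Compare the wanted key with every record, starting at the first."""
    for record in file:
        if not record["deleted"] and record["key"] == key:
            return record
    return None  # end of file reached without a match

def mark_deleted(file: list, key: int) -> bool:
    """Locate the record, then flag it instead of physically removing it."""
    record = linear_search(file, key)
    if record is None:
        return False
    record["deleted"] = True
    return True

serial_file = []
for k, d in [(42, "A"), (7, "B"), (19, "C")]:
    append_record(serial_file, k, d)
```

Note that a record marked as deleted still occupies space; reclaiming that space would require rewriting the file.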
Sequential Organization
• Creation of file -> In a sequential file the records are arranged one after another, in ascending or descending order of key field value.
• Access -> Records in a sequential file can be accessed by linear search, in which records are read from the first record onward until the required record is found or the end of the file is reached. The alternative is skip search, in which the given key is compared with the key of a record after skipping a fixed number of records each time. Because the records are ordered, only the block in which the key could lie then needs to be searched linearly.
• Insertion -> Insertion can be performed either at the end of the file or by rewriting the file; in either case the key sequence must be maintained.
• Deletion -> To delete a record, it is first located, either sequentially or by skip search, and then marked for deletion.
• Media Used -> Sequential organization may be implemented on magnetic tape or on hard disk, i.e. on a serial-access device as well as on a direct-access device.
• Advantages ->
  • Faster access than serial organization, because skip search can be used.
  • Maximum utilization of space.
  • Algorithms are easy to implement.
• Disadvantages ->
  • The sequence of records must be maintained, which requires extra time and effort.
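Skip search on a sequential file can be illustrated with the minimal sketch below. It assumes the keys are held in an ascending Python list and that the skip distance is a fixed parameter; the function name `skip_search` and the default skip of 3 are choices made for this example only.

```python
from typing import List, Optional

def skip_search(keys: List[int], target: int, skip: int = 3) -> Optional[int]:
    """Return the position of target in an ascending list of keys, or None.

    Assumes keys are sorted in ascending order and skip >= 1.
    """
    n = len(keys)
    start = 0
    # Jump ahead `skip` records at a time while the last key of the
    # current block is still smaller than the target.
    while start < n and keys[min(start + skip, n) - 1] < target:
        start += skip
    # Linear scan inside the single block that may hold the target.
    for pos in range(start, min(start + skip, n)):
        if keys[pos] == target:
            return pos
    return None
```

For a file of n records, this compares roughly n/skip block-end keys plus at most `skip` keys inside one block, instead of up to n keys for a plain linear search.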
Direct Organization (Random or Relative)
• Creation of file -> In direct file organization the records are placed on the backing storage device without regard to key sequence. The key field value of a record is converted to an address on the backing storage device, and the record is stored at that calculated address.
• Access -> Any record can be accessed directly from its storage address. Preceding records need not be read, i.e. every record can be retrieved independently.
• Insertion -> A record can be inserted at any position by calculating its address.
• Deletion -> A record can be deleted from any position by calculating its address.
• Media Used -> Direct organization can be created only on a direct-access storage device such as magnetic disk.
• Advantages ->
  • Immediate access to records for updating.
  • Transactions need not be sorted.
  • Random inquiries, which are frequent in business, can be handled easily.
  • Updating a record does not require rewriting the entire file.
  • Suitable for interactive online applications such as airline or railway reservation and banking.
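The key-to-address idea behind direct organization can be sketched as follows. The notes do not say which transformation is used, so this example assumes the division-remainder method (key modulo the number of slots). The class `DirectFile` and its methods are hypothetical. Two keys mapping to the same address (a collision) is simply reported as an error here, since collision handling is not covered in the notes.

```python
class DirectFile:
    """Toy relative file: a fixed number of slots addressed by record number."""

    def __init__(self, n_slots: int) -> None:
        self.slots = [None] * n_slots

    def address(self, key: int) -> int:
        # Assumed key-to-address transformation: division remainder.
        return key % len(self.slots)

    def insert(self, key: int, data: str) -> int:
        addr = self.address(key)
        occupant = self.slots[addr]
        if occupant is not None and occupant[0] != key:
            # Two keys mapped to one address; resolving this is outside the sketch.
            raise ValueError(f"collision at address {addr}")
        self.slots[addr] = (key, data)
        return addr

    def read(self, key: int):
        # One address calculation, no scan of preceding records.
        record = self.slots[self.address(key)]
        return record[1] if record is not None and record[0] == key else None

    def delete(self, key: int) -> bool:
        addr = self.address(key)
        if self.slots[addr] is not None and self.slots[addr][0] == key:
            self.slots[addr] = None
            return True
        return False
```

Every operation touches exactly one slot, which is why no sorting of transactions and no rewriting of the file is needed.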
Indexed Sequential Organization
• Creation of file -> In this organization the records are stored in sequence, but direct access to individual records is possible through an index. The storage area is divided into three parts: the prime area, the overflow area and the index area.
  • Prime area: It is the main storage area on the backing storage device. Records are placed in the prime area in sequential order, i.e. ascending or descending order of key field value. Records are written to the prime area when the file is created or reorganized.
  • Overflow area: Records are stored here when the prime area is full.
  • Index area: It stores the index of the file. Each index entry contains a track number and the highest key field value on that track.
e.g. if we have a storage device with 5 tracks and 5 records can be stored on each track, then the indexed sequential organization can be pictured as follows:
[Figure: tracks 1 to 5, showing the prime area, index area and overflow area]
• Access -> Records can be accessed linearly, reading from the first record onward until the required record is found or the end of the file is reached. In the second method, the given key field value is compared with the index to find the first track whose highest key field value is greater than or equal to the given key; that track is then searched sequentially for the record.
• Insertion -> Insertion can be performed either at the end of the file or in between, by shifting records within the file. If the prime area is full during insertion, records are moved to the overflow area. e.g. in the above example, if a record with key field value 64 is to be inserted, then 67 (on track no. 3) is shifted to the next position and 72 is shifted to the overflow area.
[Figure: tracks 1 to 5, showing the prime area, index area and overflow area after the insertion]
• Deletion -> To delete a record, it is first located using the indexed sequential method and then marked for deletion.
• Media Used -> Indexed sequential organization can be created only on a direct-access storage device such as magnetic disk.
• Advantages ->
  • Faster access than serial or sequential organization.
  • Combines the positive aspects of both sequential and direct access files.
• Disadvantages ->
  • Space is needed to hold the index.
  • Slower retrieval than direct organization, because searching the index takes time.
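The index lookup and the overflow behaviour on insertion can be sketched as below. This is a simplified illustration, not a full indexed sequential implementation. The class name, the single shared overflow list and the key values used to exercise it are all assumptions, and the keys are not those of the figure in the notes. The index is built once at creation and is deliberately left unchanged on insertion, so a record pushed into overflow is still reached through its original track.

```python
from bisect import bisect_left, insort

class IndexedSequentialFile:
    """Toy model: prime area as a list of tracks, plus index and overflow."""

    def __init__(self, sorted_keys, track_capacity):
        # Assumes sorted_keys is non-empty and in ascending order.
        self.capacity = track_capacity
        # Prime area: sorted keys split into tracks of fixed capacity.
        self.tracks = [sorted_keys[i:i + track_capacity]
                       for i in range(0, len(sorted_keys), track_capacity)]
        # Index area: highest key on each track, in track order.
        self.index = [track[-1] for track in self.tracks]
        # Overflow area: records pushed out of a full track.
        self.overflow = []

    def find_track(self, key):
        """First track whose highest key is >= key (last track if none)."""
        return min(bisect_left(self.index, key), len(self.tracks) - 1)

    def search(self, key):
        t = self.find_track(key)
        if key in self.tracks[t]:        # sequential scan of one track
            return ("prime", t)
        if key in self.overflow:         # then the overflow area
            return ("overflow", None)
        return None

    def insert(self, key):
        track = self.tracks[self.find_track(key)]
        insort(track, key)               # later records shift along
        if len(track) > self.capacity:
            self.overflow.append(track.pop())  # last record leaves the track
```

With three tracks of three records each, inserting a key into a full middle track shifts the following records and moves that track's last record to the overflow area, mirroring the insertion example in the notes.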