CIS 402: File Management Techniques Chapter 1


Introduction to the Design and Specification of File Structures

Outline
  • What are File Structures?
  • Why Study File Structure Design?
  • Overview of File Structure Design
Definitions
  • A File Structure is a combination of representations for data in files and of operations for accessing the data.
  • A File Structure allows applications to read, write, and modify data. It might also support finding the data that matches some search criteria or reading through the data in some particular order.
Data Storage
  • Computer data can be stored in three kinds of locations:
    • Primary storage: computer memory (RAM)
    • Secondary storage: online disk, tape, or CD-ROM that the computer can access
    • Tertiary storage: archival data on offline disk, tape, or CD-ROM that is not directly available to the computer



Memory versus Secondary Storage
  • Secondary storage such as disks can pack thousands of megabytes into a small physical space.
  • Computer memory (RAM) is limited by comparison.
  • However, relative to memory, access to secondary storage is extremely slow. For example, getting information from slow RAM takes 120 × 10^-9 seconds (120 nanoseconds), while getting information from disk takes 30 × 10^-3 seconds (30 milliseconds).
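To make the gap concrete, the two figures above can be divided: 30 milliseconds over 120 nanoseconds is roughly 250,000, so one disk access costs about as much time as a quarter of a million memory accesses. A minimal sketch of that arithmetic follows; the constant and function names are illustrative and not part of the slides.

```cpp
#include <cassert>

// Figures quoted on the slide; real hardware varies widely.
const double kRamAccessSeconds  = 120e-9;  // 120 nanoseconds
const double kDiskAccessSeconds = 30e-3;   // 30 milliseconds

// Returns how many accesses of the faster device fit into the time
// taken by one access of the slower device.
double access_ratio(double slow_seconds, double fast_seconds) {
    assert(fast_seconds > 0.0);
    return slow_seconds / fast_seconds;
}
```

Calling access_ratio(kDiskAccessSeconds, kRamAccessSeconds) gives a value of about 250,000, which is why the design goals focus on counting disk accesses rather than in-memory work.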
Secondary Storage Access Time

Improving the File Structure.

  • The details of how the data is represented and how the operations on it are implemented determine the efficiency of the file structure for a particular application.
  • Careful design of these details can reduce secondary storage access time.
File Structure Design: General Goals
  • Get the requested information with one disk access, if possible.
  • Otherwise, get the information with as few accesses as possible.
  • Group information in the most advantageous manner possible.
File Structure Design: Fixed versus Dynamic Files
  • If a file is static (fixed), structuring it is elementary: the location of each piece of data stays the same.
  • Dynamic files grow and shrink as information is added and deleted, which makes structuring the file much more difficult.
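As a sketch of why a static file is easy to structure, assume (this layout is an assumption, not something the slides specify) a file made of a fixed-size header followed by fixed-length records. The position of any record can then be computed directly from its record number, with no searching. The struct and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: a fixed-size header followed by fixed-length records.
struct FixedLayout {
    std::size_t header_bytes;  // size of the file header
    std::size_t record_bytes;  // size of every record
};

// Byte offset of record number `rrn` (0-based) within the file.
std::size_t record_offset(const FixedLayout& layout, std::size_t rrn) {
    assert(layout.record_bytes > 0);
    return layout.header_bytes + rrn * layout.record_bytes;
}
```

With a 32-byte header and 64-byte records, record 10 starts at byte 32 + 10 × 64 = 672. Once records can be inserted, deleted, or vary in length, this simple formula no longer holds, which is the difficulty the second bullet refers to.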
History of File Structuring: Early Work
  • Assumed that files were on tape.
  • Access was sequential and the cost of access grew in direct proportion to the size of the file.
History of File Structuring: Disks and Indexes
  • As files grew very large, unaided sequential access was not a good solution.
  • Disks allowed for direct access.
  • Indexes made it possible to keep a list of keys and pointers in a small file that could be searched very quickly.
  • With the key and pointer, the user had direct access to the proper area in a large, primary file.
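The key-and-pointer idea can be sketched as a small sorted table that maps each key to a byte offset in the primary file. This is only an illustration under assumptions: the entry layout, the use of a byte offset as the "pointer", and the names IndexEntry and find_offset are not taken from the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One index entry: a key plus a "pointer" (byte offset) into the primary file.
struct IndexEntry {
    std::string key;
    long offset;
};

// Returns the byte offset stored for `key`, or -1 if the key is not indexed.
// The index must be sorted by key so that binary search applies.
long find_offset(const std::vector<IndexEntry>& index, const std::string& key) {
    assert(std::is_sorted(index.begin(), index.end(),
                          [](const IndexEntry& a, const IndexEntry& b) {
                              return a.key < b.key;
                          }));
    auto it = std::lower_bound(
        index.begin(), index.end(), key,
        [](const IndexEntry& e, const std::string& k) { return e.key < k; });
    if (it != index.end() && it->key == key) return it->offset;
    return -1;
}
```

Because the index is much smaller than the primary file, it can be searched quickly; the returned offset is then used for a single direct access (a seek) into the large file.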
History of File Structuring: Tree Structures
  • Indexes are also sequential in nature:
    • when they get large, they too become difficult to manage.
  • The idea of using tree structures to manage indexes emerged in the early 1960s.
  • Trees are an improvement.
    • Problem: they can grow unevenly as records are added and deleted, resulting in long searches that require many disk accesses to find a record.
History of File Structuring: Balanced Trees
  • In 1963, researchers came up with AVL trees for data in memory (AVL stands for Adel’son-Vel’skii and Landis, the designers of the tree). These trees are height-balanced.
  • AVL trees did not apply well to files:
    • they work well when each tree node holds a single record, not when a node holds dozens or hundreds of records, as a disk block does.
  • In the 1970s, B-Trees were developed:
    • they require O(log_k N) access time, where
      • N is the number of entries in the file
      • k is the number of entries indexed in a single block of the B-Tree structure
    • B-Trees can guarantee that one file entry among millions of others can be found with only 3 or 4 trips to the disk.
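The "3 or 4 trips" figure follows from the O(log_k N) bound. A rough sketch, assuming an idealized tree in which every block is full and each level costs one disk access (real B-Trees only guarantee partially full blocks, so this is an estimate, not a guarantee): the number of levels is the smallest h with k^h ≥ N. The function name below is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of levels h such that k^h >= n, i.e. ceil(log_k n).
// Uses integer arithmetic to avoid floating-point rounding; intended for
// n and k small enough that k^h does not overflow 64 bits.
int levels_needed(std::uint64_t n, std::uint64_t k) {
    assert(k >= 2);
    int levels = 0;
    std::uint64_t reachable = 1;  // entries reachable with `levels` levels
    while (reachable < n) {
        reachable *= k;
        ++levels;
    }
    return levels;
}
```

For example, with k = 100 entries per block, N = 1,000,000 entries need 3 levels (100^3 = 1,000,000), and with k = 64, N = 10,000,000 entries need 4 levels (64^3 = 262,144 is too few, 64^4 = 16,777,216 is enough).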
History of File Structuring: Hash Tables
  • Retrieving entries in 3 or 4 accesses was an improvement for large files,
    • but it does not reach the goal of accessing data with a single disk access.
  • Hashing can provide a solution:
    • it is used in lists for quick access to data
    • it is a good way to achieve single access with files that do not change size greatly over time.
  • Extendible Dynamic Hashing guarantees at most two disk accesses regardless of how large a file becomes.
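The single-access idea behind static hashing can be sketched as follows: a hash function turns the key directly into a bucket number, and the bucket number into a byte offset, so no index needs to be searched. This shows plain static hashing over a fixed number of fixed-size buckets, not extendible hashing, and it ignores collisions and bucket overflow. The multiply-by-31 hash and the function names are illustrative choices, not taken from the slides.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative hash (multiply-by-31 polynomial) reduced to a bucket number
// in the range [0, bucket_count).
std::uint64_t bucket_for(const std::string& key, std::uint64_t bucket_count) {
    assert(bucket_count > 0);
    std::uint64_t h = 0;
    for (unsigned char c : key) h = h * 31 + c;
    return h % bucket_count;
}

// Byte offset of the key's bucket in a file of fixed-size buckets.
std::uint64_t bucket_offset(const std::string& key,
                            std::uint64_t bucket_count,
                            std::uint64_t bucket_bytes) {
    return bucket_for(key, bucket_count) * bucket_bytes;
}
```

For the key "AB" with 101 buckets, the hash is 65 × 31 + 66 = 2081, and 2081 mod 101 = 61, so the record lives in bucket 61; with 512-byte buckets that is byte offset 31,232. The scheme depends on the bucket count staying fixed, which is why it suits files that do not change size greatly over time.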
File Structures via C++
  • Application of proper software engineering
    • Object-oriented approach
    • Separation of Concerns
    • Reuse of code; specifically, classes
  • Obtain implementations that precisely mimic the specifications