
Data File Hierarchy/Terminology


thuyet

Presentation Transcript


  1. Data File Hierarchy/Terminology
     • Folder or Directory
        • A container for 0 or more files accessible from the same location.
     • File
        • A collection of related records stored as a unit on external media.
     • Record or Row
        • A collection of related fields, such as the pieces of information about one person or thing.
     • Field
        • A single (meaningful) piece of information about a person or thing.
     • Byte
        • Storage for one character; one or more characters (bytes) make up a field (character ~ byte).
     • Bit
        • The smallest unit of computer data, commonly written as 0 or 1.
        • Generally 8 bits = 1 byte, yielding 256 combinations, which is how ASCII character codes are stored.

  2. Data File Hierarchy Example
     • Folder or Directory
        • A folder for class files called Courses.
     • File
        • The file CPT250Grades.txt inside the Courses folder holds all class information for one course.
     • Record or Row
        • Each record in the grades file contains Student Name, Student ID, three test grades and ten project grades.
     • Field
        • Student ID consists of the last 5 digits of the social security number.
     • Byte
        • Each character of the ID requires one byte of storage.
     • Bit
        • The character "2" of the student ID has the ASCII code 50, which is written as 00110010, where each symbol is a bit.
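The byte-to-bit relationship in the example above can be checked with a few lines of VB. This is only a sketch: ByteToBits is a hypothetical helper name, not a built-in function, and it assumes a single-byte (ASCII/ANSI) character.

```vb
' Returns the 8 bits of a character's ASCII code as a string of 0s and 1s.
' ByteToBits is a hypothetical helper, not part of VB itself.
Public Function ByteToBits(ByVal Ch As String) As String
    Dim Code As Integer
    Dim BitPos As Integer
    Dim Bits As String

    Code = Asc(Ch)                      ' Asc("2") is 50
    For BitPos = 7 To 0 Step -1         ' most significant bit first
        If (Code And 2 ^ BitPos) <> 0 Then
            Bits = Bits & "1"
        Else
            Bits = Bits & "0"
        End If
    Next BitPos
    ByteToBits = Bits
End Function
```

Calling ByteToBits("2") yields the string 00110010, matching the slide: 50 = 32 + 16 + 2.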

  3. Physical Details of Files
     • Before you can write programs that use files, you need to get answers to the following:
        • Can records be accessed sequentially?
        • Can records be accessed arbitrarily?
        • Do all of the records have the same length?
        • How are the fields ordered in each record?
        • What is the data type of each field?
        • Is the length of each field the same?
        • How are the fields separated?

  4. Different Types of Files
     • Random-access data files
        • Files designed to facilitate future retrieval of individual records in an arbitrary order.
        • Retrieving a record requires determining its location
           • Hashing
           • Index (ISAM)
        • Thus, the file is logically used like an array, but the data resides on disk
     • Binary data files
        • A machine-readable format that can contain almost anything
        • Unreadable if loaded in a word processor or text editor
     • Program files
        • Special type of binary file that the OS can interpret with the right OCXs, DLLs, EXEs.

  5. Different Types of Files
     • Sequential data files
        • Data elements stored and retrieved one after another, in order
        • Examples
           • Text files (Report-Record or Display-Formatted format)
              • Simple documents such as README files created with an ASCII text editor like Notepad
              • Designed to be read by people, so there are no field delimiters
              • Carriage return and line feed mark the end of each line
           • Comma-separated value (CSV) files
              • Common means for transferring data across applications
              • Contain variable-length records whose fields are separated by commas
              • Certain types of data are surrounded by special characters
              • Carriage return and line feed mark the end of each line/record
           • Fixed-width data files
              • Commonly created from/for COBOL programs

  6. General usage of Files
     1) Assign a file number
        • To refer to the file in subsequent file access statements
     2) Open the file
        • Identify file name & path, purpose, reference number, and record length if random file
     3) Read/Write the file (possibly within a loop)
        • Specify reference number, action, and data elements
        • Read retrieves data from the file into specified memory locations (variables)
        • Write copies data in specified memory locations (variables) to the current location in the file
     4) Close the file
        • Specify reference number to release the number and the file
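The four steps can be sketched in one short procedure. The sub name and the relative path "Courses\CPT250Grades.txt" are illustrative assumptions; the file is assumed to exist and to be plain text.

```vb
' Minimal sketch of the four steps: assign, open, read, close.
' The path is illustrative; adjust it to a file that exists.
Public Sub ShowGradesFile()
    Dim GradesFile As Integer
    Dim TextLine As String

    GradesFile = FreeFile                                       ' 1) assign a file number
    Open "Courses\CPT250Grades.txt" For Input As #GradesFile    ' 2) open the file
    Do Until EOF(GradesFile)                                    ' 3) read within a loop
        Line Input #GradesFile, TextLine
        Debug.Print TextLine
    Loop
    Close #GradesFile                                           ' 4) close the file
End Sub
```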

  7. Assigning a File Number
     • Two choices
        • Use a literal constant (between 1 and 511)
        • Use the FreeFile function to have VB automatically assign an available reference number
     • Use caution when working with multiple files simultaneously: FreeFile keeps returning the same number until that number has been used in an Open statement
     • Incorrect
           Dim InNbr As Integer
           Dim OtNbr As Integer
           InNbr = FreeFile    ' Returns 1.
           OtNbr = FreeFile    ' Returns 1 again, because file #1 is not open yet.
           Open "Input.dat" For Input As #InNbr
           Open "Output.dat" For Output As #OtNbr
     • Correct
           Dim InNbr As Integer
           Dim OtNbr As Integer
           InNbr = FreeFile    ' Returns 1.
           Open "Input.dat" For Input As #InNbr
           OtNbr = FreeFile    ' Returns 2.
           Open "Output.dat" For Output As #OtNbr

  8. Opening a File
     • Provide information about the file before any access (braces mark optional parts)
           Open pathname For mode {Access access} {lock} _
               As #filenumber {Len = recordlength}
     • Elements of Open statement
        • pathname: a string that defines the file name with full path
        • mode: the purpose (input, output, append, random, binary)
           • Establishes record pointer at start of file for all modes but append, which starts at end of file
        • access: specifies allowable operations (needed with random)
        • lock: specifies allowable operations by other processes (n/a)
        • filenumber: reference number used in subsequent read/write operations
        • recordlength: number of bytes in a record of a random access file, or buffer size for sequential files
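As an illustration of the modes, the sketch below opens the same small file for output, append, and input, and then opens a random-access file. The file names and the record length of 64 bytes are placeholders chosen for the example, not values from the slides.

```vb
' Sketch of the Open statement in several modes.
' File names and Len = 64 are placeholders.
Public Sub OpenModeExamples()
    Dim FileNbr As Integer

    FileNbr = FreeFile
    Open "Report.txt" For Output As #FileNbr     ' create or overwrite; pointer at start
    Print #FileNbr, "first line"
    Close #FileNbr

    FileNbr = FreeFile
    Open "Report.txt" For Append As #FileNbr     ' keep contents; pointer at end
    Print #FileNbr, "second line"
    Close #FileNbr

    FileNbr = FreeFile
    Open "Report.txt" For Input As #FileNbr      ' read only; pointer at start
    Close #FileNbr

    FileNbr = FreeFile
    Open "Emp.dat" For Random Access Read Write As #FileNbr Len = 64
    Close #FileNbr
End Sub
```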

  9. Reading from a File
     • CSV file
           Input #filenumber, VariableList
        • Data is read from the current location of the file pointer into the corresponding variables
        • A comma or end-of-line delimits each field, unless it appears inside quotation marks
        • Normally, records were created using the Write statement

  10. Reading from a File
     • Report (display-formatted) file
           Line Input #filenumber, VariableName
        • Read the next line from the current location of the file pointer into the specified variable
        • Normally, records were created using the Print statement
     • Fixed record-length file
           Get #filenumber, {recordnumber}, VariableName
        • Read the record at recordnumber, or the next record from the current location of the file pointer if recordnumber is omitted, into the specified variable
        • Needed by fixed record-length files since there is no end-of-line character
        • Normally, records were created using the Put statement

  11. Detecting End-of-File
     • Use EOF function
        • Special function that detects when the end of file has been reached
              EOF(filenumber)
     • Use data sentinel
        • Record that contains specific, fixed values which, when read, signal that the last record has been processed
     • Use unique first record that defines the number of records that follow
        • Atypical approach that allows a counting loop to be used to read the fixed number of records
     • Use an error handler to detect the read-past-end-of-file error
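The error-handler option is the only one of these approaches that the later slides do not illustrate. A minimal sketch follows, assuming a comma-delimited file named "Input.dat" (a placeholder) and relying on VB's run-time error 62, "Input past end of file". The EOF function is usually the simpler choice.

```vb
' Sketch: read values until VB raises error 62 ("Input past end of file").
' "Input.dat" is a placeholder file name.
Public Sub ReadWithErrorHandler()
    Const ERR_INPUT_PAST_EOF As Long = 62
    Dim InFile As Integer
    Dim Value As String
    Dim Count As Long

    InFile = FreeFile
    Open "Input.dat" For Input As #InFile
    On Error GoTo ReadFailed
    Do
        Input #InFile, Value        ' fails with error 62 once the data runs out
        Count = Count + 1
    Loop

Done:
    Close #InFile
    Debug.Print Count; "values read"
    Exit Sub

ReadFailed:
    If Err.Number <> ERR_INPUT_PAST_EOF Then
        MsgBox "Read failed: " & Err.Description
    End If
    Resume Done                     ' leave the loop and close the file
End Sub
```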

  12. Writing to a File
     • CSV file
           Write #filenumber, VariableList
        • Data in the variable list is written to the current file location
        • Comma inserted between fields, carriage return and line feed inserted at end of record, quotation marks placed around strings, and # around Booleans and dates
        • Normally, records will be read using the Input statement
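To make the delimiting rules concrete, here is a small sketch that writes one record with a string, two numbers, a Boolean, and a date. The file name and the sample values are invented for the example.

```vb
' Sketch: one Write # statement and the line it is expected to produce.
' File name and values are made up for illustration.
Public Sub WriteSampleRecord()
    Dim OutFile As Integer

    OutFile = FreeFile
    Open "sample.csv" For Output As #OutFile
    Write #OutFile, "Smith, Pat", 12345, 18.5, True, #1/31/2024#
    Close #OutFile
    ' The line in the file should look like this:
    ' "Smith, Pat",12345,18.5,#TRUE#,#2024-01-31#
End Sub
```

Because the name is quoted, the comma inside "Smith, Pat" does not split the field when the record is later read with Input #.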

  13. Writing to a File
     • Report (display-formatted) file
           Print #filenumber, OutputList
        • Print the output list to the next line in the file, unless the previous Print ended in a comma or semicolon
        • May contain expressions, Spc(n), Tab(n), separated by comma or semicolon
        • Normally, records will be read using the Line Input statement
     • Fixed record-length file
           Put #filenumber, {recordnumber}, VariableName
        • Write the data in VariableName to record recordnumber, or to the current file location if recordnumber is omitted
        • Normally, records will be retrieved using the Get statement
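A fixed record-length file pairs Put with Get. The sketch below assumes a hypothetical record type GradeRec and file name "Grades.dat"; fixed-length strings are used so that every record occupies the same number of bytes.

```vb
' Place the Type in the declarations section of a standard module.
' GradeRec, its fields, and "Grades.dat" are hypothetical.
Private Type GradeRec
    StudentID As String * 5
    StudentName As String * 20
    TestAvg As Single
End Type

Public Sub RandomFileDemo()
    Dim Rec As GradeRec
    Dim GradeFile As Integer

    GradeFile = FreeFile
    Open "Grades.dat" For Random As #GradeFile Len = Len(Rec)

    Rec.StudentID = "12345"
    Rec.StudentName = "Pat Smith"       ' padded with spaces to 20 characters
    Rec.TestAvg = 91.5
    Put #GradeFile, 3, Rec              ' write record number 3

    Get #GradeFile, 3, Rec              ' read record number 3 back
    Debug.Print Rec.StudentID; " "; RTrim$(Rec.StudentName); Rec.TestAvg
    Close #GradeFile
End Sub
```

Using Len(Rec) for the record length keeps the Open statement in step with the type definition if fields are added later.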

  14. Closing a File
     • Notifies the operating system that you are done with the file
           Close {{#}filenumberlist}
     • Examples
           Close #EmpFile, #TaxFile
           Close                       ' no list: closes all open files
     • For output files, the buffers are flushed to ensure that everything gets written to the file.
     • When reading the same file more than once in a single session, you must open and close it each time.

  15. Reading Files with pre-data
     • When pre-data is included at the start of the data file, or the program knows the number of records, use a For/Next counting loop
     • Loop condition compares the loop counter against the number of records given by the pre-data
     [Flowchart: open file for input (#1); Input #1, EndVal; set Count = 1; while Count > EndVal is false: Input #1, Array(Count), other steps to process in loop, Count = Count + 1; when true: Close #1]
           ' use static arrays and literal RefNbr
           Open "file" For Input As #1
           Input #1, EndVal
           For Count = 1 To EndVal Step 1
               Input #1, Array(Count)
               ' steps to process repeatedly
           Next Count
           Close #1

  16. Reading Files with post-data
     • Pre-test loop to check for sentinel data
     [Flowchart: open file for input (#1); Input #1, TempVar; set Count = 0; while TempVar = sentinel data is false: Count = Count + 1, Array(Count) = TempVar, other steps to process in loop, Input #1, TempVar; when true: Close #1]
           ' use static arrays and literal RefNbr
           Open "file" For Input As #1
           Input #1, TempVar
           Count = 0
           Do Until TempVar = SentinelData
               Count = Count + 1
               Array(Count) = TempVar
               ' steps to process repeatedly
               Input #1, TempVar
           Loop
           Close #1

  17. Reading Files with no pre/post data
     • Pre-conditional loop to check EOF
     • This is the most likely approach when files are used
     [Flowchart: open file for input (#1); set Count = 0; while EOF(1) is false: Count = Count + 1, Input #1, Array(Count), other steps to process in loop; when true: Close #1]
           ' use static arrays and literal RefNbr
           Open "file" For Input As #1
           Count = 0
           Do Until EOF(1)
               Count = Count + 1
               Input #1, Array(Count)
               ' steps to process repeatedly
           Loop
           Close #1

  18. How does reading a CSV file work?
     • Open statement
        • Record pointer set to start of file (before first record)
     • Each time an input statement is processed
        • For each variable listed
           • Type of data in the field must be compatible with the data type of the corresponding variable where it will be stored
           • Data at the file's record pointer is read and stored in the corresponding, named field (variable)
           • Record pointer moves to the start of the next field (whether on the same line or the next line)
     • Close statement to release file to OS

  19. VB Statements to Read CSV File
           ' Use dynamic array and FreeFile function
           ' fEmp is assumed to be dimensioned already (e.g. ReDim fEmp(0 To 0)),
           ' otherwise UBound(fEmp) raises a run-time error.
           Dim EmpCount As Integer
           Dim CsvInFile As Integer
           CsvInFile = FreeFile    ' Returns 1 if no other file is open.
           Open "a:\sample.csv" For Input As #CsvInFile
           Do Until EOF(CsvInFile)
               EmpCount = UBound(fEmp) + 1
               ReDim Preserve fEmp(0 To EmpCount) As EmpType
               Input #CsvInFile, fEmp(EmpCount).ID, _
                   fEmp(EmpCount).Name, fEmp(EmpCount).Rate, _
                   fEmp(EmpCount).Hours
               ' optional code to process employee's data
           Loop
           Close #CsvInFile
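The read loop relies on a user-defined type EmpType and a dynamic array fEmp that the slides never declare. One possible set of declarations is sketched below. The field names come from the slide, but the field types and the InitEmpArray helper are assumptions.

```vb
' Declarations section of the form or module.
' Field types are guesses; the slides only show the field names.
Private Type EmpType
    ID As String
    Name As String
    Rate As Currency
    Hours As Single
End Type

Private fEmp() As EmpType

' Hypothetical helper: call once before the read loop so UBound(fEmp) is valid.
Private Sub InitEmpArray()
    ReDim fEmp(0 To 0)      ' element 0 is a placeholder; real data starts at index 1
End Sub
```

Starting with a single placeholder element means the first record read lands at index 1, which matches a later loop that runs from 1 to UBound(fEmp).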

  20. How does writing a CSV file work?
     • Open statement
        • New file created in stated path with given name
        • Output mode
           • Record pointer is set to the start of the file
        • Append mode
           • Record pointer is set to the end of the file
     • Each time a write statement is processed
        • For each variable listed
           • Data in memory (variable) is written at record pointer
           • Strings enclosed in double quotes; Booleans and dates enclosed in #
           • Comma separator used unless last variable in list
        • Carriage return and line feed mark the end of the line
     • Close statement to release file to OS

  21. How does writing a report file work?
     • Open statement
        • Same as writing a CSV file
     • Each time a print statement is processed
        • Start on a new line unless the previous print ended in a comma "," or semicolon ";"
        • For each expression listed
           • Data is written at current file pointer
           • Expressions separated by semicolons ";" keep the file pointer where it left off
           • Expressions separated by commas "," move the pointer to the next print zone
     • Close statement to release file to OS
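The effect of the two separators is easiest to see side by side. This sketch writes a few lines to a placeholder file, "Report.txt"; the text values are arbitrary.

```vb
' Sketch: semicolon versus comma in Print #, plus Tab(n) and a trailing semicolon.
' "Report.txt" is a placeholder file name.
Public Sub PrintZonesDemo()
    Dim RptFile As Integer

    RptFile = FreeFile
    Open "Report.txt" For Output As #RptFile
    Print #RptFile, "Name"; "ID"            ' semicolon: "ID" follows "Name" directly
    Print #RptFile, "Name", "ID"            ' comma: "ID" starts in the next print zone
    Print #RptFile, "Total:"; Tab(20); 42   ' Tab(n) moves to column n before printing
    Print #RptFile, "No line break here ";  ' trailing semicolon suppresses the new line
    Print #RptFile, "same line"
    Close #RptFile
End Sub
```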

  22. VB Statements to Write CSV File
           ' Use dynamic array and FreeFile function
           Dim Index As Integer
           Dim EmpCount As Integer
           Dim CsvOutFile As Integer
           CsvOutFile = FreeFile    ' Returns 1 if no other file is open.
           Open "a:\sample.csv" For Output As #CsvOutFile
           EmpCount = UBound(fEmp)
           For Index = 1 To EmpCount Step 1
               ' Index (not EmpCount) selects the employee written on each pass
               Write #CsvOutFile, fEmp(Index).ID, _
                   fEmp(Index).Name, fEmp(Index).Rate, _
                   fEmp(Index).Hours
           Next Index
           Close #CsvOutFile

  23. Before using Sequential Files
     • You must determine or plan the file's structure
        • Number of records
           • To determine type of loop processing & size of array(s) needed
        • Number, order, and type of fields per record
           • To determine the fields (generally on one line)
              • If different types: for the user-defined array
              • If same type: number of columns for 2-d array
        • Pre-data
           • Generally the number of following records (counting loop)
        • Post-data (sentinel data)
           • If present, to flag when the last good data set has been reached
        • If no pre-data or post-data, may use the EOF function or write an error handler to determine when the end of file is reached
