FILE I/O: Low-level
PowerPoint Slideshow about 'FILE I/O: Low-level' - adia

General Ideas. High vs. low-level

Opening a file

Closing a file

Writing data to the file

Reading data from the file (numerical, strings, etc.)

1. High vs low level

High level

  • “1 line of code opens the file, reads/writes the data, closes the file”

clc
clear
%upload data from file
data = dlmread('myfile.dat');
%ready to analyze the data..

Low level

  • Requires at least 3 lines!

clc
clear
%open the file
__ = fopen(_________);
%grab the data 'properly'
______(a lot of options)_____
%close the file
fclose(_____);
%ready to analyze the data…

1. Low-Level, cont.
  • Some files have a mixed format that high-level functions such as xlsread() and dlmread() cannot read!
  • Since the data is not easily recognized by any high-level function, every step to read the file requires a separate MATLAB command:
    • Open a file, either to read from or to write to
    • Read or write data from/to the file with the specific delimiters
    • Close the file
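
The three steps above can be sketched as follows. This is only an illustration: the file name and its contents are made up, and the sketch writes the file first so that it runs on its own.

```matlab
% Assumption: 'sketch_mixed.txt' and its contents are hypothetical.
fname = fullfile(tempdir, 'sketch_mixed.txt');

% Open, write, close: create a small mixed-format file
fid = fopen(fname, 'w');
fprintf(fid, 'Voltage readings\n');
fprintf(fid, '%d %f\n', [1 2 3; 1.5 2.5 3.5]);   % one "index value" pair per line
fclose(fid);

% Open, read, close: the same three steps, this time for reading
fid = fopen(fname, 'r');                     % 1. open
header = fgetl(fid);                         % 2. read the text line...
values = fscanf(fid, '%d %f', [2, inf])';    % ...then the numbers
fclose(fid);                                 % 3. close
delete(fname);                               % clean up the sketch file
```

Each step is its own command, which is what makes the low-level approach longer than a single dlmread() call.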
The order of today..
  • first, learn the fopen() command
  • then, learn the fclose() command
  • dump data to a file - fprintf()
  • read string data - fgets() and fgetl()
  • read numerical data - fscanf()
  • read combination of numerical and string - textscan()
2. Opening / Closing Files
  • "Open a file"
    • Program requests access to a file (from the OS) for reading and/or writing data to/from the file.
  • "Close a file"
    • Inform the OS that the program is finished working with the file.
2. Opening a file
  • There are many syntaxes possible (from the doc file):
  • The most commonly used ones (in EGR115):

Description

fileID= fopen(filename); %opens the file filename for read access, and returns an integer file identifier.

fileID= fopen(filename, permission); %opens the file with the specified permission.
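
A minimal sketch of what fopen() returns, assuming two hypothetical file names (one that does not exist, one created only for this sketch). fopen() returns -1 when the file cannot be opened, so the identifier can be checked before going further.

```matlab
% Assumption: both file names below are made up for this sketch.
missing = fopen('no_such_file_for_this_sketch.txt');   % read access by default
if missing == -1
    disp('Could not open the file.');    % fopen returns -1 on failure
end

fname = fullfile(tempdir, 'sketch_open.txt');
fid = fopen(fname, 'w');     % 'w' creates the file if needed
isValid = (fid >= 3);        % 0, 1 and 2 are reserved for standard input, output and error
fclose(fid);
delete(fname);
```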

2. Opening a file
  • Syntax to open a file

fileID = fopen( filename , permission );

    • fileID is known as a file identifier. It is like a "nickname" and is used instead of the filename when actually working with the file.
    • filename represents a string that is the name of the file with its extension (the letters after the dot). It can either be hardcoded or stored in a variable.
    • permission is a string that describes the type of access for the file

>> why is this needed?

2. Opening Files: permission
  • The “permission” indicates how MATLAB will use the file after being opened.
  • Since the operating system has permissions (read-only, write-only) assigned to files, when you request access to a file you must tell the system in what mode you will be using the file.
  • The strings used for this tell the OS what it needs to know, and have an impact on how MATLAB will use the file.
2. Opening Files: permission
  • The most commonly used strings are:

'r' Open file for reading (default).

'w' Open or create new file for writing. Discard existing contents, if any.

'a' Open or create new file for writing. Append data to the end of the file.

  • There are MANY more

>> doc fopen<enter>

2. Opening Files: File Position Pointer
  • What happens when MATLAB opens a file to read or write?
    • When a file is opened, a “file position pointer” is created. The system keeps track of the point in the file to which your program has read or written.
    • Think of it like a cursor that moves as you read or write the file.
    • The file position pointer is set initially to different locations depending on the permission granted.
2. Opening Files: File Position Pointer
  • Depending on the access-mode, does a file get “wiped” or not, “created” or not?
  • You should be able to reason this out – memorization is not the key here!

Access Mode    Delete Content?    Create File?
r              ____               ____
w              ____               ____
a              ____               ____

The 'w' row is the only ‘tricky’ one. Think: 'w' = write = wipe.
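
One way to check your reasoning is to try the three modes on a throwaway file. The sketch below assumes a hypothetical file name in the temporary folder and uses fileread() to look at the result after each step.

```matlab
fname = fullfile(tempdir, 'sketch_modes.txt');   % hypothetical file name

fid = fopen(fname, 'w');  fprintf(fid, 'first\n');   fclose(fid);   % creates the file
fid = fopen(fname, 'a');  fprintf(fid, 'second\n');  fclose(fid);   % adds at the end
afterAppend = fileread(fname);     % both lines are present

fid = fopen(fname, 'w');  fprintf(fid, 'third\n');   fclose(fid);   % wipes, then writes
afterWrite = fileread(fname);      % only the last line remains

delete(fname);
fidRead = fopen(fname, 'r');       % 'r' does not create a missing file: returns -1
```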

2. Opening Files: Choosing a permission
  • Trivia

A “log file” is a file that keeps a history of events. Many programs keep log files. They help programmers see what occurred in the past so that a problem can be fixed.

For example, swiping for attendance creates a log-file.

If your program is going to keep a log file, what is the best mode to use when opening this file? Why?

2. Closing Files
  • Syntax:

fclose(fid); %usually ignore the return value
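
The return value is usually ignored, but it can be collected when you want to confirm the close succeeded. A small sketch, assuming a hypothetical file name:

```matlab
fname = fullfile(tempdir, 'sketch_close.txt');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, 'data that must reach the disk\n');
status = fclose(fid);     % 0 means the file was closed successfully, -1 means it was not
delete(fname);
```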

  • After working with a file, it is important to close the file. Other than being good form, it is critical when writing to the file.
  • Remember this? "Safe remove" warning on USB drives.
    • When the OS is supposed to put information on disk it frequently waits until it determines the best time. This is known as "write caching".
    • Windows may wait to write data. If your program finishes and Windows hasn't written this data, it will not be written at all!

>> Closing the file forces Windows to write the data to the disk.

2. Examples of opening and closing

% Example 1) open a file from which to read
fileGrades = fopen('grades.txt', 'r'); %hardcode the filename
<code block to be inserted here>
%close file
fclose(fileGrades); %use the file handle – not the file name!

% Example 2) ask user for a filename, then open it to read
nameFile = uigetfile('*.txt');
fileGrades = fopen(nameFile, 'r'); %no quotes: nameFile is a variable
<code block to be inserted here>
%close file
fclose(fileGrades);

Notice that the file handle variable can be any acceptable variable name.

3. Writing to Text Files

fprintf(<file handle>, … The rest is as usual...);

Don’t forget the semi-colon!

Otherwise, MATLAB displays in the command window a number! fprintf() default output is how many characters were printed.

Example:

fh = fopen('log_file.txt', 'a');
for k = 1:nbEvents
    fprintf(fh,'Event #%d: %15s %s\n',k,events{k,1},events{k,2}); %file handle – not the file name!
end
fclose(fh);
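
The example above relies on nbEvents and events being defined elsewhere. A self-contained version might look like this; the event data and the file location are assumptions made only for the sketch.

```matlab
% Assumptions: the contents of 'events' and the log file location are made up.
events = {'10:02', 'badge swiped'; '10:05', 'door opened'};
nbEvents = size(events, 1);

logName = fullfile(tempdir, 'sketch_log.txt');
fh = fopen(logName, 'a');               % append: earlier entries are kept
for k = 1:nbEvents
    % collecting the return value is optional; it is the number of bytes written
    nbBytes = fprintf(fh, 'Event #%d: %15s %s\n', k, events{k,1}, events{k,2});
end
fclose(fh);

logText = fileread(logName);            % read the whole log back to inspect it
delete(logName);                        % clean up the sketch file
```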

3. MS Windows Text files
  • When writing to a text file, MATLAB writes only a single newline character at the end of a line.
  • Yet Windows software such as Notepad expects two different characters at the end of a line.
  • If you open the file in Notepad, it will not look like you expect:
  • But if you open with WordPad…
3. MS Windows Text files
  • There is nothing wrong with this – unless you intend to work with the file outside of your program (and in Windows).
  • To make it Windows-ready, write both a carriage return (\r) and a newline (\n):
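
A minimal sketch of writing both characters, assuming a hypothetical file name. The last lines only verify that the file really ends with a carriage return (ASCII 13) followed by a newline (ASCII 10).

```matlab
fname = fullfile(tempdir, 'sketch_crlf.txt');   % hypothetical file name
fh = fopen(fname, 'w');
fprintf(fh, 'line one\r\n');    % carriage return + newline
fprintf(fh, 'line two\r\n');
fclose(fh);

raw = fileread(fname);
endsWithCRLF = isequal(double(raw(end-1:end)), [13 10]);
delete(fname);
```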
3. Writing Text Files

Inserting data into the middle of a text file

Writing to text files is not like working in Word!

When you write to a text file, the new data overwrites any existing data located after the file position pointer – there is no “insert mode”!

3. Writing Text Files

What we think should happen…

3. Writing Text Files

What REALLY happens…

3. Writing Text Files

There is no quick-fix to this problem.

You must write code that moves the existing file data so that you can insert the new data. This might mean copying to a new file, or looping and overwriting the old data.
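
The sketch below illustrates both behaviours on a tiny hypothetical file: writing in the middle overwrites, and a real insertion means rebuilding the content and rewriting the file. The 'r+' permission (read and write) and fseek() are not covered in these slides; they are used here only to place the cursor in the middle of the file.

```matlab
fname = fullfile(tempdir, 'sketch_insert.txt');   % hypothetical file name
fid = fopen(fname, 'w');  fprintf(fid, 'ABCDEFGH');  fclose(fid);

% Writing in the middle overwrites what is already there
fid = fopen(fname, 'r+');        % open for reading and writing, keep the contents
fseek(fid, 2, 'bof');            % move the cursor 2 bytes past the beginning of the file
fprintf(fid, 'xy');              % replaces 'CD'
fclose(fid);
overwritten = fileread(fname);   % 'ABxyEFGH'

% A real insertion: read everything, rebuild the text, rewrite the file
txt = fileread(fname);
txt = [txt(1:2) 'NEW' txt(3:end)];
fid = fopen(fname, 'w');  fprintf(fid, '%s', txt);  fclose(fid);
inserted = fileread(fname);      % 'ABNEWxyEFGH'
delete(fname);
```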

4. Reading text files

LAST but not least…

  • There are many ways to read from a file, because its content can be almost anything.
    • Numbers? (Remember to use dlmread() if there are ONLY numbers!)
    • Strings?
    • Numbers and strings?
    • Pattern?
    • No pattern?
  • Therefore, there are many built-in functions, and ALL can be used in combinations, repeatedly, within a loop… to make it work!
4. Reading files
  • Examples of files requiring low-level functions
  • It’s all about moving that cursor from top to bottom, and grabbing the data as you go!
Lots of built-in functions
  • There are many functions out there! Per line in the file:
    • Some are good with strings

fgets() – grabs an entire line including \n

fgetl() – grabs entire line, stops cursor before \n

    • Some are good with numbers

fscanf(), textscan() – can return a numerical array

    • Some are good with strings & numbers

textscan() - returns 1 row cell-array

EACH OF THESE GRAB DATA AND MOVE THE CURSOR – much like when you read a book!!!!

Example1
  • Which function to use:
    • Line 1? _______
    • Line 2? _______
    • Lines below? _______
Move the cursor one line at a time!

clc
clear

%open file
fileID = fopen('Electricallab.txt'); %'r' by default

%grab data
%line1
line1 = fgets(fileID);
%line2
line2 = fgets(fileID);
%below
data = fscanf(fileID,'%f %f',[2,inf])';

%close file
fclose(fileID);

%now analyze the data!
avg_volts = mean(data(:,2))

%somewhere, use line1 and line2… otherwise, ignore the return values above, and don't waste memory:
%fgets(fileID); %for line1
%fgets(fileID); %for line2
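
Since the data file itself is not reproduced here, the following sketch builds a stand-in file with the same general shape (two text lines, then two numeric columns) and reads it the same way. The layout and the values are assumptions.

```matlab
% Assumption: the file layout and values below are made up for this sketch.
fname = fullfile(tempdir, 'sketch_lab.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'Electrical lab\nTime(s) Volts\n');
fprintf(fid, '%f %f\n', [0 1 2; 4.5 5.0 5.5]);    % one "time volts" pair per line
fclose(fid);

fid = fopen(fname);                       % 'r' by default
fgetl(fid);                               % move past line 1, ignore the string
fgetl(fid);                               % move past line 2, ignore the string
data = fscanf(fid, '%f %f', [2, inf])';   % one row per line of the file
fclose(fid);
delete(fname);

avg_volts = mean(data(:,2));
```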

Example2
  • We only want columns 1 and 10…
Move and Skip*!

clc
clear

%open file
fileID = fopen('10Columns.txt'); %'r' by default

%grab data
%lines1-4
for k = 1:4
    fgets(fileID); %notice: the return value is not collected, since it is not used later
end
%below
data = fscanf(fileID,'%f %*d %*f %*f %*d %*d %*d %*f %*f %f',[2,inf])';

%close file
fclose(fileID);

Example3

line 1? _______

lines 3-end? _____ 

textscan() friendly

clc
clear

%open file
fileID = fopen('exampleIDName.txt'); %'r' by default

%grab data
%line1
line1 = fgets(fileID);
dataCellArray = textscan(fileID,'%d %s %d');
%CAREFUL: the data in cells 1 and 3 are int32! Use double() to convert.

%close file
fclose(fileID);

%who has the highest grade?
%cellplot(dataCellArray) %I play a lot with this to visualize the cell array!
[maxi, rowNb] = max(dataCellArray{3});
nameMax = dataCellArray{2}{rowNb} %{} everywhere, since cell 2 contains another cell array
IdofMax = dataCellArray{1}(rowNb) %regular (), since cell 1 contains a good old numerical vector
columnInCell1 = double(dataCellArray{1}); %extract the content of cell 1 (which happens to be a vector)
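
The file used above is not reproduced here, so this sketch builds a small stand-in (a header line followed by ID, name, grade records) and applies the same steps. The records are invented for illustration.

```matlab
% Assumption: the header and the three records are made up for this sketch.
fname = fullfile(tempdir, 'sketch_grades.txt');
fid = fopen(fname, 'w');
fprintf(fid, 'ID Name Grade\n');
fprintf(fid, '101 Ana 88\n102 Ben 95\n103 Cy 79\n');
fclose(fid);

fid = fopen(fname);
fgetl(fid);                              % move past the header line
data = textscan(fid, '%d %s %d');        % 1x3 cell array, one cell per placeholder
fclose(fid);
delete(fname);

ids    = double(data{1});                % int32 -> double
names  = data{2};                        % cell array of character vectors
grades = double(data{3});
[~, rowNb] = max(grades);
bestName = names{rowNb};                 % {} because names is itself a cell array
bestId   = ids(rowNb);                   % () because ids is a numerical vector
```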


EXTREMELY PICKY!!!!
  • CAUTION: fscanf() and textscan() are EXTREMELY picky when it comes to the format string. Even 1 extra space or missing character could throw MATLAB off.
  • Consider this file:

data=textscan(fid,'%s %d %d %d AM'); %would not work

data=textscan(fid,'%s %d %d:%d AM'); %would work

data=textscan(fid,'%s %d %d:%d %s'); %would work

data=textscan(fid,'%s %d %d:%d %c%c'); %would work

data=textscan(fid,'%s %d %d:%d %c %c'); %would NOT work

More picky examples…

File has:

Jamie 12:34:44 $45.23

Carol 4:56:54 $23.21

'%s %d:%d:%d $%f'

File has:

34-22 - 532.5 *34.53

12-33 - 111.5 *22.34

'%d-%d - %f *%f'
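
As a rough check of the first format string, the two sample lines can be passed to textscan() directly as text. Reading from a character vector instead of a file identifier is a simplification made for this sketch.

```matlab
% The two lines come from the first example above.
txt = sprintf('Jamie 12:34:44 $45.23\nCarol 4:56:54 $23.21\n');
C = textscan(txt, '%s %d:%d:%d $%f');   % the ':' and '$' must match the text exactly

names   = C{1};     % cell array of character vectors
hours   = C{2};     % int32 column
amounts = C{5};     % double column
```

Five placeholders give five cells; the literal characters between them are matched and discarded.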

4. Reading text files - Strings
  • Reading an entire line as a string
    • including storing the new line character in the variable

str = fgets(<file handle>);

    • without storing the new line character in the variable

str = fgetl(<file handle>);

>> Both function calls above move the cursor down to the next line (you can use this to “skip” lines by ignoring its return value!)

Using fgets()
  • Includes the new line character in your variable

Suppose we had this data file (.txt file)

And ran this program:

fh = fopen('testdata.txt', 'r');

x = fgets(fh);

fprintf('->%s<-', x); %old fprintf

Notice there are TWO newlines in the variable:

Using fgets()
  • The important idea though is that MATLAB moved the cursor past the first line. Since the file has not been closed, MATLAB is ready to scan the second line!
    • Use another fgets() to grab the next line..
    • Or write a for loop to repeat the fgets()

How can we get rid of the \r\n ?

Using fgetl()
  • Reads past the newline, but DOES NOT include the newline character in your variable

And ran this program:

fh = fopen('testdata.txt', 'r');

x = fgetl(fh);

fprintf('->%s<-', x) %old fprintf
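
To compare the two functions side by side, here is a small sketch on a hypothetical two-line file. It also shows what fgetl() returns once there is nothing left to read.

```matlab
fname = fullfile(tempdir, 'sketch_lines.txt');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, 'alpha\nbeta\n');
fclose(fid);

fid = fopen(fname, 'r');
withNewline    = fgets(fid);    % 'alpha' followed by the newline character (6 characters)
withoutNewline = fgetl(fid);    % 'beta', newline removed
atEnd          = fgetl(fid);    % -1: the cursor is already at the end of the file
fclose(fid);
delete(fname);
```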

4. Reading text files – Numerical data

fscanf() is like the reverse of fprintf(). Specify the format you want to match and fscanf() will read from the file as long as it can match that format.

fscanf() is not good for reading strings because it will save the characters as their ASCII equivalents.

Returns 1 numerical array (2D if necessary).

That’s why it doesn’t like strings, usually of different length!

4. Using fscanf() – part1 of 7

So:

After opening the file, and moving the cursor past the first line (using fgets() for example), read the contents using:

%open file (read by default)

fh = fopen('example.txt');

%move cursor past first line (ignore return value)

fgets(fh);

%load the numerical data

data = fscanf(fh, '%d %d');

fclose(fh); %done with file, close it before continuing…

4. Using fscanf() – part2

However, the result would be:

This demonstrates that fscanf() reads the data in line-order, but then stores it as a column. Change this format using one more argument on the function call.

4. Using fscanf() – part3

Change the function call to:

data = fscanf(fh, '%d\t%d', [2, 3])

The return-value collected is now:

MATLAB is still reading the data in line-order, and still storing the data in column-order, but we've now specified how big the columns will be – two rows each.

Add this 3rd argument

4. Using fscanf() – part4

We may want the data to be in the form of the file. Unfortunately, changing the third argument doesn’t help:

data = fscanf(fh, '%d\t%d', [3, 2])

  • Original file data:
  • This is because fscanf() is still filling the variable in “column-order” – it fills a column first and then moves onto the next column.
4. Using fscanf() – part5

To fix this, first read it in as a 2x3 matrix:

data = fscanf(fh, '%d\t%d', [2, 3])

Then transpose the matrix:

data = data'

4. Using fscanf() – part6

Usually, combine all in one line:

data = fscanf(fh, '%d\t%d', [2, 3])'

4. Using fscanf() – part7
  • Suppose the number of lines in the file is unknown, or more importantly is constantly updated!
  • Use MATLAB’s inf constant (infinity). It means “as many as needed”

data = fscanf(fh, '%d\t%d', [2, inf])';
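
The whole progression of parts 1 to 7 can be replayed on a small stand-in file. The file name and the values are assumptions, and frewind() is used only to send the cursor back to the top between the two reads.

```matlab
fname = fullfile(tempdir, 'sketch_pairs.txt');   % hypothetical file and values
fh = fopen(fname, 'w');
fprintf(fh, '%d\t%d\n', [1 2 3; 10 20 30]);      % lines: "1 10", "2 20", "3 30" (tab-separated)
fclose(fh);

fh = fopen(fname);
asColumn = fscanf(fh, '%d\t%d');                 % no size argument: one long column
frewind(fh);                                     % move the cursor back to the start
asFile = fscanf(fh, '%d\t%d', [2, inf])';        % 2 rows per column, then transpose
fclose(fh);
delete(fname);
```

asColumn lists the values in reading order, while asFile has one row per line of the file, however many lines there are.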

  • Now, if the data file gets larger, the program can still handle it.
fscanf() CAUTION
  • It is impossible to skip to a specific column only.

For example: There are 10 columns in the file, and all you need is columns 3 and 7. Too bad.. You must code 10 placeholders no matter what.

DO NOT do this:

Data = fscanf(fid,'%f %d %f %f %d %d %d',[7, inf]);

That will move the cursor past the 7th column, then start scanning from there again!!

Do this:

Data = fscanf(fid,'%*f %*d %f %*f %*d %*d %d %*f %*f %*f',[2, inf]); %the * tells MATLAB to read past this part without storing it within Data. BUT THERE ARE 10 PLACEHOLDERS
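
The same idea on a smaller scale, as a sketch with a hypothetical four-column file: every column gets a placeholder, and the * marks the ones to read past.

```matlab
fname = fullfile(tempdir, 'sketch_cols.txt');    % hypothetical file and values
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %d %d\n', [1 2; 10 20; 100 200; 1000 2000]);
fclose(fid);                                     % lines: "1 10 100 1000" and "2 20 200 2000"

fid = fopen(fname);
% four placeholders for four columns; only columns 2 and 4 are stored
kept = fscanf(fid, '%*d %d %*d %d', [2, inf])';
fclose(fid);
delete(fname);
```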

5. Reading files – Strings & Numbers
  • Assume the following file:
  • Again, move past the first line using fgets() or fgetl(), then how should we grab both integers, and strings?
    • Remember fscanf() is NOT friendly, it will return ASCII values!
5. Using textscan()
  • textscan() is similar to fscanf() but is friendly to strings and numbers.

There is still 1 return value, but this time it is a cell-array (capable of having strings and numbers of any size!)

5. Using textscan()
  • Assume this updated data file:
  • After opening the file, and moving the cursor past the first line (using fgets() for example), read the contents using:

data = textscan(fid,'%d %s %d')

5. Using textscan()
  • The return value is not a 2D cell-array, but rather 1 single row:
  • To extract the data, simply reference the one cell you’re interested in, using { }. For example:

allNames = _________________

allGrades= _________________

allIDs = _________________


5. Using textscan() – option1
  • HOWEVER, YOU (programmer) have much much more control over how the information is taken.

data=textscan(fid,'%s %d %d:%d AM')

4 placeholders = 4 columns

5. Using textscan() – option2
  • Note: with low-level, YOU (programmer) have much much more control over how the information is taken.

data=textscan(fid,'%s %d %d:%d %s')

5 placeholders = 5 columns

5. Using textscan() – option3
  • Note: with low-level, YOU (programmer) have much much more control over how the information is taken.

data=textscan(fid,'%s %d %s AM')

3 placeholders = 3 columns

textscan() CAUTION
  • There isn’t a way to read only UP TO the columns wanted.

For example: there are 10 columns in the file, and all you need is columns 3 and 7. Too bad..

Don’t do this:

Data = textscan(fid,'%s %d %s %s %d %d %d');

That will move the cursor past the 7th column, then start scanning from there again (so column8)!!

Do this:

Data = textscan(fid,'%*s %*d %s %*s %*d %*d %d %*f %*f %*f'); %the * tells MATLAB to read past this part without storing it within Data. BUT THERE ARE 10 PLACEHOLDERS

Other caution
  • Use the simplest form of the placeholder.
    • %c
    • %s
    • %d
    • %f
  • DO NOT use format modifiers, such as:

.2 (as in %.2f)

- (as in %-20f)

10.3 (as in %10.3f)

Etc…

So far.. No loop
  • fscanf() and textscan() scan and repeat the format string. They stop when the pattern no longer matches.
  • fgets() and fgetl() do not. They scan 1 line only. You may have to write a loop to make it more efficient!

%move cursor past the 11th line (skip all first 11 lines)

for k = 1:11

fgets(fid); %ignore string returned

end
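
When the number of lines is not known ahead of time, a while loop is one common alternative to the for loop above. The sketch assumes a hypothetical three-line file and relies on fgetl() returning -1, which is not text, at the end of the file.

```matlab
fname = fullfile(tempdir, 'sketch_loop.txt');    % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, 'one\ntwo\nthree\n');
fclose(fid);

fid = fopen(fname);
lines = {};
tline = fgetl(fid);
while ischar(tline)          % stops when fgetl returns -1
    lines{end+1} = tline;    % store the line just read
    tline = fgetl(fid);      % move the cursor to the next line
end
fclose(fid);
delete(fname);
```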

Try all ideas at home!
  • Assume this file, scan the data!
  • Read the file, keep only the actual caption lines, and write those lines to a new, separate file!
Key Ideas: Low Level Functions
  • used when a mixture of data is in the file
  • always require the use of fopen(), and fclose()
    • fopen() has 3 main permission modes: read, write, append.
  • There are many functions out there! Out of those seen:
    • Some are good with strings

fgets()

fgetl()

    • Some are good with numbers

fscanf(), textscan() – can return a numerical array

    • Some are good with strings & numbers

textscan() - returns 1 row cell-array

  • Note: rarely is the actual name of the file used, besides on the fopen() call. All other functions require the file identifier.