
Huffman Codes

Huffman Codes: Using Binary Files. Last class we extended a program to create a Huffman code and permit the user to encode and decode messages. We will use that program as our starting point today.


Presentation Transcript


  1. Huffman Codes Using Binary Files

  2. Getting Started • Last class we extended a program to create a Huffman code and permit the user to encode and decode messages. • We will use that program as our starting point today: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/ • File Huffman_Code_with_Associative_Map.zip • Download, extract, build, and run.

  3. Program in Action Widen window to 100.

  4. Binary Output • Huffman codes are useful in real life only when we output the coded message as binary. • Let's modify do_encode to output a file. • Start with a text file. • ASCII 1's and 0's • Modify to write a binary file.
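To make the motivation concrete: a text file of ASCII 1's and 0's spends a whole byte on every code bit, while a binary file can pack eight bits into each byte. The following is a minimal sketch of that size comparison. The helper names `text_size_bytes` and `packed_size_bytes` are hypothetical and are not part of the class program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// File size in bytes when each code bit is written as an ASCII '0' or '1'.
std::size_t text_size_bytes(const std::string& bit_string)
{
    assert(bit_string.find_first_not_of("01") == std::string::npos);
    return bit_string.size();
}

// File size in bytes when the same bits are packed eight to a byte,
// rounding up so a partial final byte is still counted.
std::size_t packed_size_bytes(const std::string& bit_string)
{
    assert(bit_string.find_first_not_of("01") == std::string::npos);
    return (bit_string.size() + 7) / 8;
}
```

For a 16-bit coded message the text form takes 16 bytes and the packed form takes 2, before any bookkeeping such as a bit count is added.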

  5. main.cpp • Add at top of main.cpp: #include <fstream> • Modified version of do_encode: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/do_encode.cpp.txt

  6. Modifications to do_encode()
      void do_encode(void)
      {
          string msg;
          string output_filename;
          ofstream outfile;
          string junk;

          // A default-constructed ofstream reports good(), so test is_open() instead.
          while (!outfile.is_open())
          {
              cout << "File name for output? ";
              cin >> output_filename;
              getline(cin, junk);      // Skip newline char
              outfile.clear();         // Reset any failure from a previous attempt
              outfile.open(output_filename.c_str());
              if (!outfile.is_open())
              {
                  cout << "Failed to open output file\n";
                  cout << "Please try again\n";
              }
          }

  7. Modifications to do_encode()
          cout << "\n\nEnter message to encode\n";
          getline(cin, msg);

          for (size_t i = 0; i < msg.length(); ++i)
          {
              char next_char = tolower(msg[i]);
              string code = huffman_tree.Encode_Char(next_char);
              cout << code;
              outfile << code;
          }
          cout << endl << endl;
          outfile << endl << endl;
          outfile.close();
          cout << "File " << output_filename << " written\n";
      }

  8. Clean Up Output • Comment out statements that output the tree and the code. • In main():
      int main(void)
      {
          cout << "This is the Huffman Code Program" << endl;
          build_huffman_tree();
          //huffman_tree.Display_List();

  9. In Huffman_Tree.cpp
      void Huffman_Tree::Make_Decode_Tree(void)
      {
          node_list.sort();
          //cout << "\nSorted list:\n";
          //Display_List();
          ...
          //cout << endl << "The Huffman Tree" << endl;
          //Display_Decode_Tree(&decode_tree_root, 0);
          //cout << endl << "The Code: " << endl;
          //Display_Code(&decode_tree_root, "");
      }

  10. Program in Action Examine c:\out.txt

  11. The Output File

  12. Invalid Characters • What should we do with characters that are not in the code? • Encode_Char() returns a zero length string. • Detect the error in do_encode(). • Tell user about the error. • Skip the invalid character in output.
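The policy described here (detect an empty code, report it, skip the character) can be tried out without the full Huffman tree. The sketch below assumes a plain `std::map` lookup as a stand-in for `huffman_tree.Encode_Char()`. The names `Code_Table`, `encode_char`, and `encode_message` are hypothetical, and the skipped characters are collected in a string rather than printed so the behavior is easy to check.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical code table: character -> string of '0'/'1' characters.
typedef std::map<char, std::string> Code_Table;

// Stand-in for Encode_Char(): returns a zero-length string
// for any character that has no code.
std::string encode_char(const Code_Table& table, char c)
{
    Code_Table::const_iterator it = table.find(c);
    return it == table.end() ? std::string() : it->second;
}

// Encode a message, skipping invalid characters and recording them in skipped.
std::string encode_message(const Code_Table& table,
                           const std::string& msg,
                           std::string& skipped)
{
    std::string bits;
    for (std::size_t i = 0; i < msg.length(); ++i)
    {
        char next_char = static_cast<char>(
            std::tolower(static_cast<unsigned char>(msg[i])));
        std::string code = encode_char(table, next_char);
        if (code.empty())
        {
            skipped += next_char;   // Report later; leave it out of the output
            continue;
        }
        assert(code.find_first_not_of("01") == std::string::npos);
        bits += code;
    }
    return bits;
}
```

With a toy table of a = 0, b = 10, c = 11, the message "AbC?a" encodes to 010110 and the question mark ends up in `skipped`.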

  13. main.cpp In do_encode()
      for (size_t i = 0; i < msg.length(); ++i)
      {
          char next_char = tolower(msg[i]);
          string code = huffman_tree.Encode_Char(next_char);
          if (code.size() == 0)
          {
              cout << endl << "Invalid character in input to do_encode: "
                   << next_char << endl;
              continue;
          }
          cout << code;
          outfile << code << " ";
      }

  14. Program Running

  15. The Output File

  16. Binary File I/O • Issues with binary files. • Hardware architecture dependencies. • Code is typically not portable. • Output is by byte, not by bit • For Huffman coding we need variable length bit strings. • Must know number of bits. • Encapsulate code to do binary file I/O in classes. • Provide relatively simple interface to the rest of the program.
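Because output is by byte while Huffman codes are variable-length bit strings, the bits have to be packed into bytes and the number of meaningful bits has to travel with them. The sketch below shows one way to do that in memory, most significant bit first. The struct `Packed_Bits` and the function `pack_bits` are hypothetical names chosen for illustration and are not the classes introduced in the following slides.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container: packed bytes plus the number of meaningful bits.
struct Packed_Bits
{
    std::vector<unsigned char> bytes;
    std::size_t bit_count;
};

// Pack a string of '0'/'1' characters into bytes, most significant bit first.
// Unused bits in the final byte are left as 0.
Packed_Bits pack_bits(const std::string& bit_string)
{
    Packed_Bits result;
    result.bit_count = bit_string.size();
    result.bytes.assign((bit_string.size() + 7) / 8, 0);

    for (std::size_t i = 0; i < bit_string.size(); ++i)
    {
        assert(bit_string[i] == '0' || bit_string[i] == '1');
        if (bit_string[i] == '1')
        {
            result.bytes[i / 8] |= static_cast<unsigned char>(0x80 >> (i % 8));
        }
    }
    return result;
}
```

Without `bit_count`, a reader could not tell the three meaningful bits of 101 from the five padding zeros that follow them in the same byte.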

  17. Binary Output File Class [Diagram labels: Client Classes, Bits, Buffer, Bit Count, Binary Output File Class]

  18. Binary Input File Class [Diagram labels: Client Classes, Bits, Buffer, Bit Count, Binary Input File Class]

  19. Binary File Classes
      Binary_File: is_open, buffer, next_bit_position, filename, BUFFER_SIZE, FIRST_BIT_POSITION, + Is_Open
      Binary_Output_File: - fstream, + Output_Bit_String, + Close, - Write_Buffer
      Binary_Input_File: - fstream, + Get_Next_Bit, + Close, - Read_Buffer

  20. Binary File I/O • Download • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Binary_File_IO/ • File Binary_File_IO_Classes.zip

  21. Binary File IO Classes Copy into project folder and add to project.

  22. Add Binary File IO Files to Project Build project.

  23. Binary_File.h
      #pragma once
      #include <string>
      using std::string;

      class Binary_File
      {
      public:
          Binary_File(const string& Filename);
          virtual void Close() = 0;
          bool Is_Open() const { return is_open; }

      protected:
          static const int BUFFER_SIZE = 1024;     // Size in bytes
          static const int FIRST_BIT_POSITION = 8*sizeof(size_t);

          // The bit count overlays the first sizeof(size_t) bytes of the buffer.
          union Buffer
          {
              char bits[BUFFER_SIZE];
              size_t bit_count;
          };

          void Reset_Buffer(void);

          const string filename;
          bool is_open;
          Buffer buffer;
          size_t next_bit_position;
      };
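The union places `bit_count` over the first `sizeof(size_t)` bytes of the buffer, which is why data bits start at `FIRST_BIT_POSITION`. The sketch below illustrates the same layout using `std::memcpy` on a raw character array instead of a union. That is a swapped-in technique chosen to keep the example portable, not what the class itself does. The constant and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

const std::size_t SKETCH_BUFFER_SIZE = 1024;                   // Bytes, as in Binary_File
const std::size_t SKETCH_FIRST_BIT = 8 * sizeof(std::size_t);  // First bit after the count

// Store the bit count in the leading bytes of a raw buffer.
void store_bit_count(char* buffer, std::size_t bit_count)
{
    assert(buffer != 0);
    std::memcpy(buffer, &bit_count, sizeof(bit_count));
}

// Read the bit count back from the leading bytes.
std::size_t load_bit_count(const char* buffer)
{
    assert(buffer != 0);
    std::size_t bit_count = 0;
    std::memcpy(&bit_count, buffer, sizeof(bit_count));
    return bit_count;
}

// Number of data bits that fit in one buffer after the count field.
std::size_t data_bit_capacity()
{
    return SKETCH_BUFFER_SIZE * 8 - SKETCH_FIRST_BIT;
}
```

Since the count is stored in the machine's native `size_t` representation, its width and byte order depend on the platform, which is one of the portability issues noted earlier.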

  24. Binary_File.cpp
      #include "Binary_File.h"

      // Initializers are listed in member declaration order.
      Binary_File::Binary_File(const string& Filename) :
          filename(Filename), is_open(false)
      {
          Reset_Buffer();
      }

      void Binary_File::Reset_Buffer(void)
      {
          for (int i = 0; i < BUFFER_SIZE; ++i)
          {
              buffer.bits[i] = 0;
          }
          next_bit_position = FIRST_BIT_POSITION;
      }

  25. Binary_Output_File.h
      #pragma once
      #include <iostream>
      #include <fstream>
      #include <string>
      #include "Binary_File.h"
      using std::string;

      class Binary_Output_File : public Binary_File
      {
      public:
          Binary_Output_File(const string& filename);
          void Output(const string& bit_string);
          void Close();

      private:
          std::fstream outfile;
          void Write_Buffer();
      };

  26. Binary_Output_File.cpp
      #include <cassert>
      #include <cmath>
      #include "Binary_Output_File.h"
      using namespace std;

      Binary_Output_File::Binary_Output_File(const string& filename) :
          Binary_File(filename)
      {
          outfile.open(filename.c_str(), ios::out | ios::binary);
          if (outfile.fail())
          {
              string err_msg("Error opening output file ");
              err_msg += filename;
              throw err_msg;
          }
          Reset_Buffer();
          is_open = true;
      }

  27. Binary_Output_File.cpp
      void Binary_Output_File::Write_Buffer()
      {
          assert(is_open);
          if (next_bit_position == FIRST_BIT_POSITION)
          {
              return;      // Nothing to write
          }
          buffer.bit_count = next_bit_position - FIRST_BIT_POSITION;
          size_t nr_bytes = (size_t) ceil(next_bit_position / 8.0);
          outfile.write(buffer.bits, nr_bytes);
          Reset_Buffer();
      }
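The byte count in Write_Buffer() covers both the bit count field and the data bits, rounded up to a whole byte. The sketch below reproduces that arithmetic with integer math so it can be checked for different `size_t` widths. The function `bytes_written` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes that one flush would write, given the number of data bits in the
// buffer and the size in bytes of the leading bit count field.
std::size_t bytes_written(std::size_t data_bits, std::size_t count_field_bytes)
{
    assert(count_field_bytes > 0);
    if (data_bits == 0)
    {
        return 0;   // Mirrors the early return when the buffer holds no data
    }
    std::size_t next_bit_position = 8 * count_field_bytes + data_bits;
    return (next_bit_position + 7) / 8;   // Integer form of ceil(n / 8.0)
}
```

A 16-bit message therefore produces a 6-byte file where `size_t` is 4 bytes wide and a 10-byte file where it is 8 bytes wide.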

  28. Binary_Output_File.cpp
      void Binary_Output_File::Output(const string& bit_string)
      {
          assert(is_open);
          for (size_t i = 0; i < bit_string.size(); ++i)
          {
              if (bit_string[i] == '1')
              {
                  size_t byte_position = next_bit_position / 8;
                  size_t bit_position_within_byte = next_bit_position % 8;
                  // Set the bit, counting from the most significant bit of the byte.
                  buffer.bits[byte_position] |= (0x80 >> bit_position_within_byte);
              }
              else
              {
                  assert(bit_string[i] == '0');
              }
              ++next_bit_position;
              if (next_bit_position == BUFFER_SIZE*8)
              {
                  Write_Buffer();      // Buffer is full
              }
          }
      }

  29. Binary_Output_File.cpp
      void Binary_Output_File::Close()
      {
          Write_Buffer();
          outfile.close();
          is_open = false;
      }

  30. Using Binary File IO • Now let's modify do_encode() to write a binary file. • Add at top of main.cpp: #include "Binary_Output_File.h"

  31. do_encode()
      void do_encode(void)
      {
          string msg;
          string output_filename;
          Binary_Output_File* outfile;
          string junk;

          while (true)
          {
              cout << "File name for output? ";
              cin >> output_filename;
              getline(cin, junk);      // Skip newline char
              try
              {
                  outfile = new Binary_Output_File(output_filename);
                  break;
              }
              catch (const string& msg)
              {
                  cout << msg << endl;
              }
          }

  32. do_encode()
          cout << "Enter message to encode\n";
          getline(cin, msg);

          for (size_t i = 0; i < msg.length(); ++i)
          {
              char next_char = tolower(msg[i]);
              string code = huffman_tree.Encode_Char(next_char);
              if (code.size() == 0)
              {
                  cout << endl << "Invalid character in input to do_encode: "
                       << next_char << endl;
                  continue;
              }
              cout << code;
              outfile->Output(code);
          }

  33. do_encode()
          cout << endl << endl;
          outfile->Close();
          delete outfile;
          cout << "File " << output_filename << " written\n";
      }

  34. Some Test Data

  35. Program in Action

  36. c:\test.dat • Look at the output file in Visual Studio • File > Open > File • [Screenshot annotations: Bit Count = 16; bits 0000 0001 0010 0011]

  37. Binary Input • Now let's modify do_decode() to read a binary input file rather than reading 1's and 0's from the keyboard. • Add at top of main.cpp: #include "Binary_Input_File.h" • Modified version of do_decode: • http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/2011_04_13_Huffman_Codes_with_Binary_IO/do_decode.cpp.txt

  38. do_decode()
      void do_decode(void)
      {
          string msg;
          string input_filename;
          Binary_Input_File* infile;
          string junk;

          while (true)
          {
              cout << "File name for input? ";
              cin >> input_filename;
              getline(cin, junk);      // Skip newline char
              try
              {
                  infile = new Binary_Input_File(input_filename);
                  break;
              }
              catch (const string& msg)
              {
                  cout << msg << endl;
              }
          }

  39. do_decode()
          string coded_message = "";
          string original_message;

          while (infile->Is_Open())
          {
              int next_bit = infile->Get_Next_Bit();
              if (next_bit < 0) break;      // No more bits
              if (next_bit == 0)
              {
                  coded_message += "0";
              }
              else
              {
                  coded_message += "1";
              }
          }
          // Release the input file, as do_encode() does for its output file.
          infile->Close();
          delete infile;

          original_message = huffman_tree.Decode_Msg(coded_message);
          cout << "Original message: " << original_message << endl;
          cout << endl << endl;
      }
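The transcript does not include the source of Binary_Input_File, so the sketch below only illustrates the contract that do_decode() relies on: Get_Next_Bit() returns 0 or 1 and returns a negative value once the meaningful bits are exhausted. The class `Bit_Reader` and the helper `read_all_bits` are hypothetical in-memory stand-ins and assume bits were packed most significant bit first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for Binary_Input_File.
class Bit_Reader
{
public:
    Bit_Reader(const std::vector<unsigned char>& data, std::size_t nr_bits)
        : bytes(data), bit_count(nr_bits), next_bit_position(0)
    {
        assert(bit_count <= bytes.size() * 8);
    }

    // Returns 0 or 1, or -1 when all meaningful bits have been consumed.
    int Get_Next_Bit()
    {
        if (next_bit_position >= bit_count)
        {
            return -1;
        }
        unsigned char byte = bytes[next_bit_position / 8];
        int bit = (byte >> (7 - next_bit_position % 8)) & 1;
        ++next_bit_position;
        return bit;
    }

private:
    std::vector<unsigned char> bytes;
    std::size_t bit_count;
    std::size_t next_bit_position;
};

// Collect the remaining bits as a string of '0'/'1', as do_decode() does.
std::string read_all_bits(Bit_Reader& reader)
{
    std::string coded_message;
    for (int bit = reader.Get_Next_Bit(); bit >= 0; bit = reader.Get_Next_Bit())
    {
        coded_message += (bit == 0) ? '0' : '1';
    }
    return coded_message;
}
```

Stopping at `bit_count` rather than at the end of the last byte keeps padding zeros out of the string that is handed to Decode_Msg().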

  40. Reading a Binary File

  41. Another Example
