## PowerPoint Slideshow about 'Binary Trees' - mahola


### Binary Trees

### EXERCISE IN VISUAL STUDIO

### Huffman Trees: Binary Trees for Compression

CS 1037

Fundamentals of Computer Science II


What is a “Tree”?

- A tree is a graph with no cycles
- A rooted tree is a tree with one node r designated the root
- Choice of root defines ancestry relationships

[Figure: three example graphs labeled "a tree", "two trees (a “forest”)", and "not a tree"; the rooted examples mark the root r]

Tree Properties

- Any two nodes are connected by a path
- A tree with n nodes has n−1 edges
- If any edge is removed, the tree becomes a forest
- If any edge is added, the graph has a cycle and is no longer a tree
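These properties suggest a simple test: a graph on n nodes is a tree exactly when it has n−1 edges and is connected. The sketch below checks both conditions; the function name `is_tree`, the 0..n-1 node numbering, and the edge-list representation are assumptions made for illustration, not part of the original slides.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: true if the undirected graph on nodes 0..n-1
// with the given edges is a tree (connected and exactly n-1 edges).
bool is_tree(int n, const std::vector<std::pair<int,int>>& edges) {
    if (n <= 0 || static_cast<int>(edges.size()) != n - 1)
        return false;                      // wrong edge count: not a tree

    // Build adjacency lists.
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    // Depth-first search from node 0, counting the nodes reached.
    std::vector<bool> seen(n, false);
    std::vector<int> stack{0};
    seen[0] = true;
    int visited = 1;
    while (!stack.empty()) {
        int u = stack.back();
        stack.pop_back();
        for (int v : adj[u]) {
            if (!seen[v]) {
                seen[v] = true;
                ++visited;
                stack.push_back(v);
            }
        }
    }
    return visited == n;                   // connected iff every node reached
}
```

A graph with n−1 edges that contains a cycle must leave some node unreachable, so the connectivity check also rules out cycles.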

Rooted Tree Terminology

- Node y is an ancestor of x if it appears on the path r→x
- Node x is a descendant of y if y is an ancestor of x
- The subtree rooted at y is the tree of y and its descendants
- Node y is the parent of x if it is the immediate ancestor of x
- Node x is a child of y if the parent of x is y


Rooted Tree Terminology

- A node with descendants is internal
- A node without descendants is a leaf
- The depth of node x is the number of edges on the path r→x
- The height of a tree is the largest depth of any node

[Figure: a rooted tree with root r and a node x where depth(x)=2]

Binary Tree Terminology

- A tree is k-ary if all its nodes have ≤ k children
- A tree is binary if it is 2-ary
- In a binary tree, a child node is called either the left child or the right child
- A k-ary tree is full if all internal nodes have k children
- A tree is complete if it is full and all leaves have the same depth
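The height and fullness definitions translate almost directly into recursive functions on a binary tree. The following is a minimal sketch; the bare `node` struct and the function names `height` and `is_full` are assumptions for illustration, and the choice of −1 for the height of an empty tree is a convention adopted here so that a single node has height 0.

```cpp
#include <algorithm>

// Minimal binary tree node (assumed for this sketch; no payload).
struct node {
    node* left;
    node* right;
};

// Height = largest depth of any node. An empty tree is given height -1
// (a convention chosen here), so a single-node tree has height 0.
int height(const node* n) {
    if (!n)
        return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

// A binary tree is full if every internal node has exactly 2 children,
// i.e. no node has exactly one child.
bool is_full(const node* n) {
    if (!n)
        return true;
    if (!n->left != !n->right)   // exactly one child is missing
        return false;
    return is_full(n->left) && is_full(n->right);
}
```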

[Figure: a 3-ary tree, and a binary tree whose children are labeled left and right]

Why Study Trees?

- Abstract representation of hierarchy
- Hierarchies natural in many applications
- networks, taxonomies, decision making, graphics
- Even when not natural, hierarchies crucial for performance (binary search trees)

[Figure: example hierarchies: a file-system tree rooted at C: (users, programs, music, ...), a domain-name tree (com, ca, edu, ...), and a “shader tree” for 3D rendering]

Binary Tree Data Structures

- A linked data structure (like linked list)
- Each node has up to two successors: its left child and right child

    struct node {
        node* left;   // pointer to root of left subtree
        node* right;  // pointer to root of right subtree
        ...           // application-specific stuff
    };

[Figure: linked nodes reachable from root, each holding data plus left and right pointers]

Binary Search Trees (BSTs)

- A BST is a binary tree where nodes contain an item and are ordered in a special way
- Main goal: data structure that supports fast insert, erase, and search
- Array-based binary search does not support fast insert/erase!

height ≈ lg n (* assuming uniformly distributed items, i.e. completely random)

The Binary Search Tree Property

- A tree is a BST if it satisfies the binary search tree property:
- Let x be a node in a BST.
- If y is a node in the left subtree of x, then y.item <= x.item.
- If y is a node in the right subtree of x, then y.item >= x.item.
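Note that the property constrains every node in a subtree, not just the immediate children, so a checker has to carry bounds down from the ancestors. Below is one possible sketch; the function name `is_bst` and the pointer-based bounds are assumptions for illustration rather than anything from the original slides.

```cpp
// Node layout assumed for this sketch.
struct node {
    node* left;
    node* right;
    int   item;
};

// Checks the binary search tree property. lo and hi point to the tightest
// bounds inherited from ancestors (null means "no bound on that side").
// Equal items are allowed on either side, matching the <= / >= definition.
bool is_bst(const node* n, const int* lo = nullptr, const int* hi = nullptr) {
    if (!n)
        return true;
    if (lo && n->item < *lo)
        return false;            // violates a bound from an ancestor to the left
    if (hi && n->item > *hi)
        return false;            // violates a bound from an ancestor to the right
    return is_bst(n->left, lo, &n->item) &&
           is_bst(n->right, &n->item, hi);
}
```

Comparing each node only with its two children would wrongly accept a tree where, say, a 1 sits in the right subtree of a 3 as the left child of a 7.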

[Figure: three example binary trees, each on the items 2, 3, 4, 7, 9]

BST Search (iterative)

    struct node {
        node* left;   // left subtree has values <= item
        node* right;  // right subtree has values >= item
        int item;     // item for this node
    };

    node* root = ...;

    node* search(int x) {
        node* n = root;
        while (n) {
            if (x < n->item)
                n = n->left;   // look in left subtree
            else if (x > n->item)
                n = n->right;  // look in right subtree
            else
                break;         // found match!
        }
        return n;              // matching node, or null if no match
    }

BST Search (recursive)

- Follows one path down the tree
- At most 2(h+1) tests, where h is height of the BST

    node* search(node* n, int x) {
        if (!n)
            return nullptr;              // fell off bottom of tree; no match
        if (x < n->item)
            return search(n->left,x);    // search left subtree
        if (n->item < x)
            return search(n->right,x);   // search right subtree
        return n;                        // found match
    }

[Figure: example BST containing the items 3, 2, 7, 4, 9]

    node* result = search(root,4);
    cout << result->item; // prints "4"

Running Time of BST Search

- If tree is well-balanced, at most c·lg n time
- If items added in random order, tree well-balanced on average
- If tree is highly skewed, up to c·n time
- If items added in sorted order, tree will be completely skewed
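A small experiment can make the contrast concrete: insert the same items in two different orders and compare the resulting heights. The sketch below is self-contained; the helper names `height_after_inserting` and `destroy` are hypothetical, and the height of an empty tree is taken to be −1 by convention.

```cpp
#include <algorithm>
#include <vector>

struct node {
    node* left  = nullptr;
    node* right = nullptr;
    int   item  = 0;
};

// Plain (unbalanced) BST insert.
void insert(node*& n, int item) {
    if (!n) {
        n = new node;
        n->item = item;
    } else if (item < n->item) {
        insert(n->left, item);
    } else {
        insert(n->right, item);
    }
}

// Largest depth of any node; -1 for an empty tree (convention used here).
int height(const node* n) {
    return n ? 1 + std::max(height(n->left), height(n->right)) : -1;
}

// Frees every node of the tree.
void destroy(node* n) {
    if (n) {
        destroy(n->left);
        destroy(n->right);
        delete n;
    }
}

// Hypothetical helper: builds a BST from items in the given order
// and reports its height.
int height_after_inserting(const std::vector<int>& items) {
    node* root = nullptr;
    for (int x : items)
        insert(root, x);
    int h = height(root);
    destroy(root);
    return h;
}
```

Inserting 1, 2, ..., 7 in sorted order produces a chain of height 6, while the order 4, 2, 6, 1, 3, 5, 7 produces a tree of height 2 over the same items.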

BST Insert (recursive)

- At most h+1 tests, where h is height of the BST
- May increase height of tree!

    void insert(node*& n, int item) {
        if (!n) {
            n = new node;                  // we hit bottom,
            n->item = item;                // so insert here
            n->left = n->right = nullptr;  // (no children yet)
        } else if (item < n->item)
            insert(n->left,item);          // item belongs to left
        else
            insert(n->right,item);         // item belongs to right
    }

    insert(root,5);

[Figure: the BST 3, 2, 7, 4, 9 after insert(root,5), with the new node 5 as a child of 4]

BST Erase Examples

erase must maintain the binary search tree property!

- erase(root,5): case 1: node is a leaf (trivial: delete 5)
- erase(root,4): case 2: has only one child (easy: unlink, then delete 4)
- erase(root,3): case 3: has two children (hard: can’t just unlink 3)


BST Erase (simple version)

(you don’t want to see the “efficient” version!)

    void erase(node*& n, int item) {
        if (!n)
            return;                           // no match, ignore erase
        if (item < n->item)
            erase(n->left,item);              // match must be on left
        else if (n->item < item)
            erase(n->right,item);             // match must be on right
        else if (!n->right) {
            node* temp = n;                   // case 1 or 2:
            n = n->left;                      // bypass n to left subtree
            delete temp;                      // (possibly NULL)
        } else if (!n->left) {
            node* temp = n;                   // case 2:
            n = n->right;                     // bypass n to right subtree
            delete temp;
        } else {
            node* successor = n->right;       // case 3: get smallest
            while (successor->left)           // value in right subtree
                successor = successor->left;  // by descending leftward;
            n->item = successor->item;        // copy its value to n and
            erase(n->right,successor->item);  // delete the easy node instead
        }
    }

See Snippet #1

Binary Tree Exercise

39. [6 marks] Write a function to print the items of a binary tree in level-order (all items at depth 0, then all items at depth 1, then all items at depth 2…). Hint: use a queue!

    void print_levelorder(node* root) {
        queue<node*> q;              // queue type with push_back/pop_front, as on the slide
        if (root)                    // (with std::queue, use push() and pop() instead)
            q.push_back(root);
        while (!q.empty()) {
            node* n = q.front(); q.pop_front();
            cout << n->item;         // visit nodes in the order they were queued
            if (n->left)
                q.push_back(n->left);
            if (n->right)
                q.push_back(n->right);
        }
    }

[Figure: example tree with root G and further nodes D, Y, A, W, Z; level-order output: GDYAWZ]

A Totally Different Application / Interpretation of Binary Trees

[Figure: binary tree with edges labeled 0 and 1 and leaves a, b, c]

Compression Problem

Q: Given list of symbols {a,b,c,...} of size n, what is the shortest string of {0,1} bits that uniquely identifies string baabac?

Easy Answer: Use fixed-length binary code of ⌈lg n⌉ bits

- Symbols: {a,b,c}, n=3
- Binary code: a:00, b:01, c:10
- baabac → 01·00·00·01·00·10 (12 bits)
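The ⌈lg n⌉ figure is just the smallest number of bits b with 2^b ≥ n, which can be computed without floating point. The sketch below shows one way; the function name `fixed_code_bits` is an assumption for illustration.

```cpp
#include <cstddef>

// Bits per symbol for a fixed-length code over n distinct symbols:
// the smallest b such that 2^b >= n, i.e. ceil(lg n) for n >= 1.
int fixed_code_bits(std::size_t n) {
    int bits = 0;
    std::size_t capacity = 1;      // number of distinct codes with `bits` bits
    while (capacity < n) {
        capacity *= 2;
        ++bits;
    }
    return bits;
}
```

For n=3 this gives 2 bits per symbol, so the 6-character string baabac needs 6 × 2 = 12 bits, as in the example.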

Compression Problem

Smart Answer: Use variable-length codes...

Frequent symbols should have shorter binary codes than infrequent symbols

Need estimate of symbol frequencies!

In baabac: a: 3 times, b: 2 times, c: 1 time

- good prefix code (a:0, b:10, c:11): baabac → 10·0·0·10·0·11 (9 bits)
- bad prefix code (a:11, b:10, c:0): baabac → 10·11·11·10·11·0 (11 bits)
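The two encodings can be reproduced by concatenating per-symbol codes from a lookup table. The following is a minimal sketch, assuming the code is given as a `std::map` from symbol to bit string; the free function `encode` and its signature are illustrative choices, not the interface from the slides.

```cpp
#include <map>
#include <string>

// Concatenates the binary code of each character of s.
// code.at() throws std::out_of_range if a symbol has no code.
std::string encode(const std::string& s,
                   const std::map<char, std::string>& code) {
    std::string bits;
    for (char ch : s)
        bits += code.at(ch);
    return bits;
}
```

With the code a:0, b:10, c:11 the string baabac encodes to 100010011 (9 bits); with a:11, b:10, c:0 it encodes to 10111110110 (11 bits).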

Optimal Binary Code Problem

- Given symbols S={a,b,c,...} and expected frequencies f(x), which binary code achieves best expected compression?
- Answer discovered in 1951 by MIT student David A. Huffman
- Build a special binary tree:

frequencies f(a)=3, f(b)=2, f(c)=1 ⇒ Huffman tree ⇒ optimal code a:0, b:10, c:11

[Figure: the Huffman tree with edges labeled 0 and 1 and leaves a, b, c. Photo: David Huffman, 1991]

Huffman Codes

- Observation: 1-to-1 correspondence of possibly optimal codes & full binary trees
- Huffman’s algorithm uses f(x) to build an optimal binary tree (a Huffman tree), and thereby optimal binary code!

[Figure: three full binary trees on the leaves a, b, c, d with edges labeled 0 and 1, and the code each one defines:]

- a:0, b:10, c:110, d:111
- a:00, b:01, c:10, d:11
- a:00, b:010, c:011, d:1

Optimal Binary Code Problem (Formal)

- Input: set of symbols S={a,b,c,...}, and frequencies f(x) for each x ∈ S
- Output: binary codes c(x) such that, for string s[0..n-1], its compressed size

  $\sum_{i=0}^{n-1} |c(s[i])|$

  is minimized.

|y| means the length of code y=c(s[i]) for string character s[i]

e.g. recall min size(baabac) = 9 bits
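As a worked check of the 9-bit figure: the compressed size is the sum of the code lengths of the characters, and grouping equal characters turns that into a sum over symbols weighted by how often each occurs. The derivation below uses the code a:0, b:10, c:11 and the counts f(a)=3, f(b)=2, f(c)=1 from the baabac example; treating f as raw occurrence counts is an assumption made for this calculation.

```latex
\begin{align*}
\text{size}(\texttt{baabac})
  &= \sum_{i=0}^{5} |c(s[i])|
   = \sum_{x \in S} f(x)\,|c(x)| \\
  &= f(a)\,|c(a)| + f(b)\,|c(b)| + f(c)\,|c(c)| \\
  &= 3 \cdot 1 + 2 \cdot 2 + 1 \cdot 2 = 9 \text{ bits}
\end{align*}
```

The same calculation for the code a:11, b:10, c:0 gives 3·2 + 2·2 + 1·1 = 11 bits, which is why assigning the shortest code to the most frequent symbol matters.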

Huffman’s Algorithm

1. Start with list of single-node trees
2. Take roots i and j with smallest f and make new root with f = f_i + f_j
3. While not a single tree, repeat step 2

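Example with x:f(x) = a:3, b:2, c:1, d:7

- Merge b:2 and c:1 into a new root with f = 3
- Merge a:3 and that root into a new root with f = 6
- Merge the f = 6 root and d:7 into the final root with f = 13
- Resulting code: a:00, b:010, c:011, d:1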
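The three steps can be sketched with a priority queue that always yields the two roots of smallest frequency. This is a minimal illustration under stated assumptions, not the course's implementation: the comparator `by_frequency`, the free functions `build` and `assign_codes`, the use of `'\0'` as the symbol of internal nodes, and the 0-left/1-right labeling are all choices made here. The sketch also never frees the nodes it allocates.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

struct node {
    node*  left;
    node*  right;
    char   symbol;     // meaningful only for leaves ('\0' used for internal nodes)
    double frequency;  // total frequency of symbols in this subtree
};

// Orders the priority queue so that top() is the root with smallest frequency.
struct by_frequency {
    bool operator()(const node* a, const node* b) const {
        return a->frequency > b->frequency;
    }
};

// Step 1: one single-node tree per symbol. Steps 2-3: repeatedly merge
// the two roots of smallest frequency until a single tree remains.
node* build(const std::map<char, double>& f) {
    std::priority_queue<node*, std::vector<node*>, by_frequency> roots;
    for (const auto& kv : f)
        roots.push(new node{nullptr, nullptr, kv.first, kv.second});
    if (roots.empty())
        return nullptr;
    while (roots.size() > 1) {
        node* i = roots.top(); roots.pop();
        node* j = roots.top(); roots.pop();
        roots.push(new node{i, j, '\0', i->frequency + j->frequency});
    }
    return roots.top();
}

// Reads codes off the tree: append 0 when going left, 1 when going right.
void assign_codes(const node* n, const std::string& prefix,
                  std::map<char, std::string>& codes) {
    if (!n)
        return;
    if (!n->left && !n->right) {   // leaf: prefix is this symbol's code
        codes[n->symbol] = prefix;
        return;
    }
    assign_codes(n->left, prefix + "0", codes);
    assign_codes(n->right, prefix + "1", codes);
}
```

For the frequencies a:3, b:2, c:1, d:7 this yields code lengths 2, 3, 3 and 1 and a root frequency of 13. The exact bit patterns depend on which merged root is placed on the left, so they may differ from a hand-drawn tree while being equally short.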

Huffman Tree in C++

    struct node {
        node* left;        // ptr to root of left subtree
        node* right;       // ptr to root of right subtree
        char symbol;       // symbol represented by this node
        double frequency;  // total frequency of symbols in
    };                     // subtree rooted at this node

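[Figure: example Huffman tree and its node records (left, right, sym, freq). The root (freq 1.0) has an internal child (freq 0.6) with leaves a:0.3 and c:0.3, and a leaf child b:0.4. Internal nodes have no symbol (shown as -1); leaf nodes hold 'a', 'b', 'c'.]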

Huffman Tree Operations

- build(map<char,double> f)
- build optimal tree with each symbol c and its frequency estimate f[c]
- string encode(string s)
- build string of binary codes from symbols s[i]
- string decode(string b)
- build string of symbols from binary string b

std::map is an STL data structure

"baabac" → encode → 10·0·0·10·0·11 → decode → "baabac"
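Decoding can be sketched as a walk down the tree: follow the left child on 0 and the right child on 1, emit the symbol on reaching a leaf, then restart at the root. The version below is a minimal illustration that takes the root as an explicit parameter (an assumption; the slide's `decode` takes only the bit string) and assumes the tree has at least two symbols and that the input is a valid sequence of '0'/'1' characters.

```cpp
#include <string>

struct node {
    node*  left;
    node*  right;
    char   symbol;     // meaningful only for leaves
    double frequency;
};

// Walks the tree bit by bit: '0' goes left, anything else goes right.
// Assumes root is an internal node and bits is a valid encoding.
std::string decode(const node* root, const std::string& bits) {
    std::string out;
    const node* n = root;
    for (char bit : bits) {
        n = (bit == '0') ? n->left : n->right;
        if (!n->left && !n->right) {   // reached a leaf: emit and restart
            out += n->symbol;
            n = root;
        }
    }
    return out;
}
```

Because no code is a prefix of another, the walk never needs to look ahead: with the tree for a:0, b:10, c:11, the bits 100010011 decode back to baabac.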

Huffman Code Summary

- Optimal way to compress when each symbol is independently sampled from distribution f
- however, most real data is not independent!
- in English, is any particular letter likely to be 'u'? what if you knew preceding letter was 'q'? ...
- Used everywhere in compression:
- image compression (JPEG/PNG/ZIP), networking, text compression (English compressed to ~40% of size)
- Totally different from BST, yet still binary tree!
- http://en.wikipedia.org/wiki/Huffman_coding
