Data Structures and Algorithms


Graphs I: Representation and Search

Gal A. Kaminka

Computer Science Department

- Reminder: Graphs
- Directed and undirected

- Matrix representation of graphs
- Directed and undirected

- Sparse matrices and sparse graphs
- Adjacency list representation

- Tuple <V,E>
- V is set of vertices
- E is a binary relation on V
- Each edge is a tuple < v1,v2 >, where v1,v2 in V

- |E| ≤ |V|^2

- Directed graph:
- Each edge < v1, v2 > in E is an ordered pair (v1, v2)

- Undirected graph:
- < v1, v2 > in E is un-ordered, i.e., a set { v1, v2 }

- Degree of a node X:
- Out-degree: number of edges < X, v2 >
- In-degree: number of edges < v1, X >
- Degree: In-degree + Out-degree
- In undirected graph: number of edges { X, v2 }
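The degree definitions above can be sketched directly on an edge set. This is a minimal illustration, not part of the original slides: the graph `V`, `E` and the function names are assumptions chosen for the example, and a self-loop is counted once as outgoing and once as incoming.

```python
# Hypothetical directed graph: E is a set of ordered pairs (v1, v2).
V = {1, 2, 3}
E = {(1, 3), (2, 2), (3, 1), (3, 2)}

def out_degree(edges, x):
    """Number of edges <x, v2> leaving x."""
    return sum(1 for (u, v) in edges if u == x)

def in_degree(edges, x):
    """Number of edges <v1, x> entering x."""
    return sum(1 for (u, v) in edges if v == x)

def degree(edges, x):
    """Degree in a directed graph: in-degree + out-degree."""
    return in_degree(edges, x) + out_degree(edges, x)
```

Each call scans the whole edge set, so it costs O(|E|); the representations discussed later make these queries cheaper.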

- Path from vertex v_0 to vertex v_k:
- A sequence of vertices < v_0, v_1, ..., v_k >
- For all 0 ≤ i < k, the edge < v_i, v_(i+1) > exists.
- Path is of length k

- Two vertices x, y are adjacent if < x, y > is an edge
- Path is simple if the vertices in the sequence are distinct.
- Cycle: a path with v_0 = v_k
- < v, v > is a cycle of length 1
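As a rough sketch of these definitions, the checks below test whether a vertex sequence is a path, a simple path, or a cycle in a directed edge set. The example edge set and all function names are assumptions made for illustration only.

```python
# Hypothetical directed edge set used only for this example.
E = {(1, 3), (2, 2), (3, 1), (3, 2)}

def is_path(edges, seq):
    """True if every consecutive pair <v_i, v_(i+1)> is an edge."""
    return all((seq[i], seq[i + 1]) in edges for i in range(len(seq) - 1))

def path_length(seq):
    """A path <v_0, ..., v_k> has length k (its number of edges)."""
    return len(seq) - 1

def is_simple(edges, seq):
    """A simple path repeats no vertex."""
    return is_path(edges, seq) and len(set(seq)) == len(seq)

def is_cycle(edges, seq):
    """A cycle is a path of length >= 1 that ends where it starts."""
    return len(seq) > 1 and is_path(edges, seq) and seq[0] == seq[-1]
```

For instance, `[1, 3, 2]` is a simple path of length 2 here, and `[2, 2]` is a cycle of length 1 because of the self-loop on vertex 2.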

- Undirected Connected graph:
- For any vertices x, y there exists a path xy (= yx)

- Directed connected graph:
- If underlying undirected graph is connected

- Strongly connected directed graph:
- If for any two vertices x, y there exist a path xy and a path yx

- Clique: a set of vertices in which every two vertices are adjacent

- If for any two vertices x, y there exists a path xy (the graph is connected):
- |V|-1 ≤ |E| ≤ |V|^2

- Graph with no cycles: acyclic
- Directed Acyclic Graph: DAG
- Undirected forest:
- Acyclic undirected graph

- Tree: undirected acyclic connected graph
- one connected component

- Adjacency matrix:
- When graph is dense
- |E| close to |V|^2

- Adjacency lists:
- When graph is sparse
- |E| << |V|^2

- Matrix of size |V| x |V|
- Each row (column) j corresponds to a distinct vertex j

- “1” in cell < i, j > if there exists an edge <i,j>
- Otherwise, “0”
- In an undirected graph, “1” in <i,j> => “1” in <j,i>
- “1” in <j,j> means there’s a self-loop in vertex j

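Example: adjacency matrix of a directed graph on vertices 1, 2, 3 (row i, column j):

|   | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 0 | 0 | 1 |
| 2 | 0 | 1 | 0 |
| 3 | 1 | 1 | 0 |

Example: adjacency matrix of an undirected graph on vertices 1, 2, 3, 4:

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 0 | 1 | 1 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 |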

- Storage complexity: O(|V|^2)
- But can use a bit-vector representation (one bit per cell)
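One way to picture the bit-vector idea is to store each matrix row as a single integer whose bit j records the edge <i, j>. The sketch below is an assumption-laden illustration: the class name `BitMatrixGraph` is hypothetical, vertices are numbered from 0, and Python integers stand in for fixed-width bit-vectors.

```python
class BitMatrixGraph:
    """Adjacency matrix with one bit per cell; row i is an int bit-vector."""

    def __init__(self, n):
        self.n = n
        self.rows = [0] * n  # all cells start at 0 (no edges)

    def add_edge(self, i, j):
        # Set bit j of row i.
        self.rows[i] |= 1 << j

    def has_edge(self, i, j):
        # Test bit j of row i.
        return bool((self.rows[i] >> j) & 1)

    def out_degree(self, i):
        # Count the 1 bits in row i.
        return bin(self.rows[i]).count("1")


g = BitMatrixGraph(3)
for i, j in [(0, 2), (1, 1), (2, 0), (2, 1)]:
    g.add_edge(i, j)
```

The asymptotic storage is still O(|V|^2) cells; only the constant factor shrinks.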

- Undirected graph: symmetric along main diagonal
- A^T is the transpose of A
- Undirected: A = A^T

- In-degree of X: sum along column X, O(|V|)
- Out-degree of X: sum along row X, O(|V|)
- Very simple, good for small graphs
- Edge existence query: O(1)
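The matrix operations listed above can be sketched with a list of lists. This is only an illustration: the function names are assumptions, and vertices are numbered from 0 rather than from 1 as in the slides.

```python
def make_matrix(n, edges, directed=True):
    """Build an n x n adjacency matrix from a list of (i, j) pairs."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # undirected: mirror across the main diagonal
    return A

def has_edge(A, i, j):
    return A[i][j] == 1                # O(1)

def out_degree(A, x):
    return sum(A[x])                   # sum along row x, O(|V|)

def in_degree(A, x):
    return sum(row[x] for row in A)    # sum along column x, O(|V|)

def is_symmetric(A):
    """True when A equals its transpose."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

# The 3-vertex directed example, shifted to 0-based vertex numbers.
D = make_matrix(3, [(0, 2), (1, 1), (2, 0), (2, 1)])
# The 4-vertex undirected example, shifted to 0-based vertex numbers.
U = make_matrix(4, [(0, 1), (0, 2)], directed=False)
```

Here `is_symmetric(U)` holds while `is_symmetric(D)` does not, matching the A = A^T property of undirected graphs.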

- Many graphs in practical problems are sparse
- Not many edges: not all pairs x, y have an edge xy

- Matrix representation demands too much memory
- We want to reduce memory footprint
- Use sparse matrix techniques

- An array Adj[ ] of size |V|
- Each cell holds a list for associated vertex
- Adj[u] is list of all vertices adjacent to u
- List does not have to be sorted

- Undirected graphs:
- Each edge is represented twice

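Example: adjacency lists of a directed graph on vertices 1, 2, 3:

- Adj[1]: 3
- Adj[2]: 2
- Adj[3]: 1, 2

Example: adjacency lists of an undirected graph on vertices 1, 2, 3, 4:

- Adj[1]: 2, 3
- Adj[2]: 1
- Adj[3]: 1
- Adj[4]: (empty)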

- Storage Complexity:
- O(|V| + |E|)
- In undirected graph: O(|V|+2*|E|) = O(|V|+|E|)

- Edge query check:
- O(|V|) in worst case

- Degree of node X:
- Out-degree: length of Adj[X], an O(|V|) calculation
- In-degree: check all Adj[] lists, O(|V|+|E|)
- Can be done in O(1) with some auxiliary information!
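A possible reading of "auxiliary information" is a per-vertex counter that is updated whenever an edge is added. The sketch below assumes that interpretation; the class name `AdjListGraph` and its methods are hypothetical, and vertices are numbered from 0.

```python
class AdjListGraph:
    """Directed graph stored as adjacency lists plus in-degree counters."""

    def __init__(self, n):
        self.adj = [[] for _ in range(n)]  # Adj[u]: vertices adjacent to u
        self.in_deg = [0] * n              # auxiliary in-degree counters

    def add_edge(self, u, v):
        self.adj[u].append(v)
        self.in_deg[v] += 1

    def has_edge(self, u, v):
        # Scans Adj[u]: O(|V|) in the worst case.
        return v in self.adj[u]

    def out_degree(self, u):
        # Python lists store their length, so this is O(1) here;
        # a plain linked list would need an O(|V|) walk.
        return len(self.adj[u])

    def in_degree(self, v):
        # O(1) thanks to the counter, instead of scanning every list.
        return self.in_deg[v]


g = AdjListGraph(3)
for u, v in [(0, 2), (1, 1), (2, 0), (2, 1)]:
    g.add_edge(u, v)
```

The counters cost O(|V|) extra memory, which does not change the O(|V| + |E|) storage bound.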

Questions?

- We have covered some of these with binary trees
- Breadth-first search (BFS)
- Depth-first search (DFS)

- A traversal (search):
- An algorithm for systematically exploring a graph
- Visiting (all) vertices
- Until finding a goal vertex or until no more vertices
- Reaches all vertices only in connected graphs

- One of the simplest algorithms
- Also one of the most important
- It forms the basis for MANY graph algorithms

- Given a starting vertex s
- Visit all vertices at increasing distance from s
- Visit all vertices at distance k from s
- Then visit all vertices at distance k+1 from s
- Then ….

[Figure: a binary tree rooted at 5, containing the nodes 5, 2, 1, 3, 8, 6, 10, 7, 9]

BFS: visit all siblings before their descendants

BFS order: 5 2 8 1 3 6 10 7 9

- q ← new queue
- enqueue(q, t) // t is the root of the tree
- while (not empty(q))
  - curr ← dequeue(q)
  - visit curr // e.g., print curr.datum
  - if curr.left exists: enqueue(q, curr.left)
  - if curr.right exists: enqueue(q, curr.right)

This version is for binary trees only!
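The binary-tree version might look as follows in Python. This is a sketch under assumptions: the `Node` class and the exact tree shape are not given in the slides, so the tree below is one hypothetical shape that is consistent with the level order 5 2 8 1 3 6 10 7 9.

```python
from collections import deque

class Node:
    def __init__(self, datum, left=None, right=None):
        self.datum = datum
        self.left = left
        self.right = right

def bfs_tree(t):
    """Return the data of a binary tree in breadth-first (level) order."""
    order = []
    if t is None:
        return order
    q = deque([t])
    while q:
        curr = q.popleft()
        order.append(curr.datum)        # "visit" curr
        if curr.left is not None:       # skip missing children
            q.append(curr.left)
        if curr.right is not None:
            q.append(curr.right)
    return order

# Assumed tree shape; only the node values come from the slides.
tree = Node(5,
            Node(2, Node(1), Node(3)),
            Node(8, Node(6, None, Node(7)), Node(10, Node(9))))
```

Calling `bfs_tree(tree)` on this assumed shape yields the nodes level by level.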

- This version assumes vertices have two children
- left, right
- This is trivial to fix

- But still no good for general graphs
- It does not handle cycles
Example:

[Figure: undirected graph on the vertices A, B, C, D, E, F, G]

Queue: A

Start with A. Put it in the queue (marked red).

Queue: A B E

B and E are next.

Queue: A B E C G D F

When we go to B, we put C and G in the queue.

When we go to E, we put D and F in the queue.

Queue: A B E C G D F F

Suppose we now want to expand C.

We put F in the queue again!

- Cycles:
- We need to save auxiliary information
- Each node needs to be marked
- Visited: no need to put it on the queue
- Not visited: put it on the queue when found

What about assuming only two children vertices?

- Need to put all adjacent vertices in queue

- unmark all vertices in G
- q ← new queue
- mark s
- enqueue(q, s)
- while (not empty(q))
  - curr ← dequeue(q)
  - visit curr // e.g., print its data
  - for each edge <curr, V>
    - if V is unmarked
      - mark V
      - enqueue(q, V)
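The pseudocode above translates almost line for line into Python. In this sketch the graph is a dictionary of adjacency lists; the edges and the neighbor ordering are assumptions inferred from the walkthrough in these slides, not data stated explicitly in them.

```python
from collections import deque

# Assumed adjacency lists for the 7-vertex example graph.
GRAPH = {
    "A": ["B", "E"],
    "B": ["A", "C", "G"],
    "C": ["B", "F"],
    "D": ["E", "F"],
    "E": ["A", "D", "F"],
    "F": ["C", "E", "D"],
    "G": ["B"],
}

def bfs(graph, s):
    """Return the vertices reachable from s in breadth-first order."""
    marked = {s}            # a vertex is marked when it is first enqueued
    q = deque([s])
    order = []
    while q:
        curr = q.popleft()
        order.append(curr)  # "visit" curr
        for v in graph[curr]:
            if v not in marked:
                marked.add(v)
                q.append(v)
    return order
```

Marking a vertex at enqueue time, rather than at dequeue time, is what keeps F from being queued twice.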

- Each vertex can be in one of three states:
- Unmarked and not on queue
- Marked and on queue
- Marked and off queue

- The algorithm moves vertices between these states

- Unmarked and not on queue:
- Not reached yet

- Marked and on queue:
- Known, but adjacent vertices not visited yet (possibly)

- Marked and off queue:
- Known, all adjacent vertices on queue or done with

[Figure: the same graph on the vertices A, B, C, D, E, F, G. In each "Queue:" line below, vertices to the left of "|" have already been taken off the queue.]

Queue: A

Start with A. Mark it.

Queue: A | B E

Expand A’s adjacent vertices.

Mark them and put them in the queue.

Queue: A B | E C G

Now take B off the queue, and queue its neighbors.

Queue: A B E | C G D F

Do the same with E.

Queue: A B E C | G D F

Visit C.

Its neighbor F is already marked, so it is not queued.

Queue: A B E C G | D F

Visit G.

Queue: A B E C G D | F

Visit D. F and E are marked, so they are not queued.

Queue: A B E C G D F |

Visit F.

E, D, and C are marked, so they are not queued again.

Done. We have explored the graph in order:

A B E C G D F.

- Complexity: O(|V| + |E|)
- Every vertex reachable from s is put on the queue exactly once
- For each vertex taken off the queue, we expand its edges
- In other words, we traverse each edge once (twice in an undirected graph, once from each endpoint)

- BFS finds shortest path from s to each vertex
- Shortest in terms of number of edges
- Why does this work?
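One way to see why this works: vertices leave the queue in non-decreasing order of distance from s, so the first time a vertex is reached is along a path with the fewest edges. The sketch below records that distance and a parent pointer for each vertex. The graph data is an assumption inferred from the walkthrough, and the function names are hypothetical.

```python
from collections import deque

# Assumed adjacency lists for the 7-vertex example graph.
GRAPH = {
    "A": ["B", "E"],
    "B": ["A", "C", "G"],
    "C": ["B", "F"],
    "D": ["E", "F"],
    "E": ["A", "D", "F"],
    "F": ["C", "E", "D"],
    "G": ["B"],
}

def bfs_shortest_paths(graph, s):
    """Return (dist, parent): edge counts from s and BFS-tree parents."""
    dist = {s: 0}           # doubles as the "marked" set
    parent = {s: None}
    q = deque([s])
    while q:
        curr = q.popleft()
        for v in graph[curr]:
            if v not in dist:
                dist[v] = dist[curr] + 1
                parent[v] = curr
                q.append(v)
    return dist, parent

def path_to(parent, v):
    """Follow parent pointers back to s, then reverse."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

dist, parent = bfs_shortest_paths(GRAPH, "A")
```

With this data, `path_to(parent, "F")` follows the parent pointers F, E, A and returns the two-edge path from A to F.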

- Again, a simple and powerful algorithm
- Given a starting vertex s
- Pick an adjacent vertex, visit it.
- Then visit one of its adjacent vertices
- …..
- Until impossible, then backtrack, visit another

- DFS(G, s):
  - mark s
  - visit s // e.g., print its data
  - for each edge <s, V>
    - if V is not marked
      - DFS(G, V)
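A recursive Python sketch of the same procedure follows. The graph data and neighbor ordering are assumptions inferred from the walkthrough in these slides; the `marked` and `order` parameters are illustrative helpers that carry state through the recursion.

```python
# Assumed adjacency lists for the 7-vertex example graph.
GRAPH = {
    "A": ["B", "E"],
    "B": ["A", "C", "G"],
    "C": ["B", "F"],
    "D": ["E", "F"],
    "E": ["A", "D", "F"],
    "F": ["C", "E", "D"],
    "G": ["B"],
}

def dfs(graph, s, marked=None, order=None):
    """Return the vertices reachable from s in depth-first order."""
    if marked is None:
        marked, order = set(), []
    marked.add(s)
    order.append(s)                 # "visit" s
    for v in graph[s]:
        if v not in marked:
            dfs(graph, v, marked, order)
    # Returning from the call is the "backtrack" step.
    return order
```

For very deep graphs, Python's recursion limit may be reached; an explicit stack is a common alternative in that case.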

[Figure: the same graph on the vertices A, B, C, D, E, F, G, with vertices numbered in the order DFS visits them]

Current vertex: A

Start with A. Mark it.

Current vertex: B

Expand A’s adjacent vertices. Pick one (B).

Mark it and visit it recursively.

Current vertex: C

Now expand B, and visit its neighbor, C.

Current vertex: F

Visit F.

Pick one of its neighbors, E.

Current vertex: E

E’s adjacent vertices are A, D and F.

A and F are marked, so pick D.

Current vertex: D

Visit D. No new vertices available. Backtrack to E. Backtrack to F. Backtrack to C. Backtrack to B.

Current vertex: G

Visit G. No new vertices from here. Backtrack to B. Backtrack to A. E is already marked, so there are no new vertices.

Done. We have explored the graph in order:

A B C F E D G

- Complexity: O(|V| + |E|)
- Every vertex reachable from s is marked and visited exactly once
- For each visited vertex, we examine all its edges
- In other words, we traverse all edges once

- DFS does not necessarily find shortest path
- Why?
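A small experiment can suggest the answer: DFS commits to one branch and may reach a nearby vertex only after a long detour. The sketch below compares the depth at which DFS first reaches each vertex with the BFS distance. The graph data is an assumption inferred from the walkthrough, and both function names are hypothetical.

```python
from collections import deque

# Assumed adjacency lists for the 7-vertex example graph.
GRAPH = {
    "A": ["B", "E"],
    "B": ["A", "C", "G"],
    "C": ["B", "F"],
    "D": ["E", "F"],
    "E": ["A", "D", "F"],
    "F": ["C", "E", "D"],
    "G": ["B"],
}

def dfs_depths(graph, s, depth=0, found=None):
    """Depth in the DFS tree at which each vertex is first reached."""
    if found is None:
        found = {}
    found[s] = depth
    for v in graph[s]:
        if v not in found:
            dfs_depths(graph, v, depth + 1, found)
    return found

def bfs_distances(graph, s):
    """Fewest edges from s to each reachable vertex."""
    dist = {s: 0}
    q = deque([s])
    while q:
        curr = q.popleft()
        for v in graph[curr]:
            if v not in dist:
                dist[v] = dist[curr] + 1
                q.append(v)
    return dist

depths = dfs_depths(GRAPH, "A")
dists = bfs_distances(GRAPH, "A")
```

With this neighbor ordering, DFS reaches E along A, B, C, F, E (4 edges), even though E is adjacent to A. A different neighbor ordering could give a different DFS path.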