In Week 12 of the CS322 course, we explored essential concepts in graph theory, focusing on spanning trees and their applications. We started with a logical warm-up problem involving two vegetarians and two cannibals crossing a river. We then discussed how to reduce a graph to a spanning tree, removing routes while keeping cities like Detroit and Minneapolis connected. We covered Kruskal's and Prim's algorithms for finding minimum spanning trees on weighted graphs, illustrating them with examples, and then moved on to asymptotic notation and the running time of algorithms.
Week 12 - Monday CS322
Last time • What did we talk about last time? • Trees • Graphing functions
Logical warmup • Two vegetarians and two cannibals are on one bank of a river • They have a boat that can hold at most two people • Come up with a sequence of boat loads that will convey all four people safely to the other side of the river • The cannibals on any given bank cannot outnumber the vegetarians…. or else!
Spanning Trees Which we somehow seemed to skip earlier…
Turning graphs into trees • Consider the following graph that shows all the routes an airline has between cities [Figure: route graph on Detroit, Minneapolis, Milwaukee, Chicago, Cincinnati, Louisville, St. Louis, and Nashville]
Turning graphs into trees • What if we want to remove routes (to save money)? • How can we keep all cities connected? [Figure: the route graph with some routes removed]
The best tree? • Does this tree have the smallest number of routes? • Why? [Figure: a tree connecting Detroit, Minneapolis, Milwaukee, Chicago, Cincinnati, Louisville, St. Louis, and Nashville]
Spanning trees • A spanning tree for a graph G is a subgraph of G that contains every vertex of G and is a tree • Some properties: • Every connected graph has a spanning tree • Why? • Any two spanning trees for a graph have the same number of edges • Why?
Weighted graphs • In computer science, we often talk about weighted graphs when tackling practical applications • A weighted graph is a graph for which each edge has a real number weight • The sum of the weights of all the edges is the total weight of the graph • Notation: If e is an edge in graph G, then w(e) is the weight of e and w(G) is the total weight of G • A minimum spanning tree (MST) is a spanning tree of lowest possible total weight
Weighted graphs example • Here is the graph from before, with labeled weights [Figure: the route graph on Detroit, Minneapolis, Milwaukee, Chicago, Cincinnati, Louisville, St. Louis, and Nashville, with edge weights 355, 74, 230, 306, 695, 348, 269, 262, 83, 242, and 151]
Finding a minimum spanning tree • Kruskal's algorithm gives an easy-to-follow technique for finding an MST on a weighted, connected graph • Informally, go through the edges, adding the smallest one, unless it forms a circuit • Algorithm:
    Input: Graph G with n vertices
    Create a subgraph T with all the vertices of G (but no edges)
    Let E be the set of all edges in G
    Set m = 0
    While m < n – 1
        Find an edge e in E of least weight
        Delete e from E
        If adding e to T doesn't make a circuit
            Add e to T
            Set m = m + 1
    Output: T
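The steps above can be sketched in Java as follows. This is a minimal illustration, not the lecture's code: the class name `KruskalSketch`, the `Edge` helper, the union-find circuit test, and the four-vertex sample graph are all assumptions introduced here. Sorting the edges once stands in for repeatedly finding and deleting the edge of least weight.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class KruskalSketch {
    // Hypothetical helper: an undirected edge {u, v} with an integer weight.
    static final class Edge {
        final int u, v, weight;
        Edge(int u, int v, int weight) { this.u = u; this.v = v; this.weight = weight; }
    }

    // Union-find lookup: returns the representative of x's component in T.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // shorten the path as we walk up
            x = parent[x];
        }
        return x;
    }

    // Returns the edges of an MST of a connected graph on vertices 0..n-1.
    static List<Edge> kruskal(int n, List<Edge> edges) {
        List<Edge> sorted = new ArrayList<>(edges);
        sorted.sort(Comparator.comparingInt((Edge e) -> e.weight));
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // T starts with no edges
        List<Edge> tree = new ArrayList<>();
        for (Edge e : sorted) {
            if (tree.size() == n - 1) break;        // m = n - 1: done
            int ru = find(parent, e.u);
            int rv = find(parent, e.v);
            if (ru != rv) {                         // adding e makes no circuit
                parent[ru] = rv;
                tree.add(e);
            }
        }
        return tree;
    }

    static int totalWeight(List<Edge> edges) {
        int sum = 0;
        for (Edge e : edges) sum += e.weight;
        return sum;
    }

    // Made-up sample graph (not the city graph from the slides).
    static List<Edge> sampleEdges() {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(0, 1, 1));
        edges.add(new Edge(1, 2, 2));
        edges.add(new Edge(0, 2, 3));
        edges.add(new Edge(2, 3, 4));
        edges.add(new Edge(1, 3, 5));
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(totalWeight(kruskal(4, sampleEdges())));
    }
}
```

On the sample graph the edges of weight 1, 2, and 4 are kept, and the edge of weight 3 is skipped because it would close a circuit.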
Minimum spanning tree example • Run Kruskal's algorithm on the city graph: [Figure: the weighted route graph on Detroit, Minneapolis, Milwaukee, Chicago, Cincinnati, Louisville, St. Louis, and Nashville, with edge weights 355, 74, 230, 306, 695, 348, 269, 262, 83, 242, and 151]
Minimum spanning tree output • Output: [Figure: the resulting spanning tree on the eight cities, keeping the edges of weight 355, 74, 230, 262, 83, 242, and 151, for a total weight of 1397]
Prim's algorithm • Prim's algorithm gives another way to find an MST • Informally, start at a vertex and add the next closest node not already in the MST • Algorithm:
    Input: Graph G with n vertices
    Let subgraph T contain a single vertex v from G
    Let V be the set of all vertices in G except for v
    For i from 1 to n – 1
        Find an edge e in G such that:
            e connects T to one of the vertices in V
            e has the lowest weight of all such edges
        Let w be the endpoint of e in V
        Add e and w to T
        Delete w from V
    Output: T
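A direct, unoptimized Java rendering of these steps might look like the sketch below. The class name `PrimSketch`, the adjacency-matrix representation, the `NO_EDGE` convention, and the sample graph are assumptions made for illustration; the graph is assumed to be connected, and a priority queue would be the usual choice for larger inputs.

```java
class PrimSketch {
    // Assumed convention: a matrix entry of 0 means "no edge".
    static final int NO_EDGE = 0;

    // weights[i][j] holds the weight of edge {i, j}, or NO_EDGE.
    // Returns the total weight of an MST, growing T from vertex 0.
    static int primTotalWeight(int[][] weights) {
        int n = weights.length;
        boolean[] inTree = new boolean[n]; // false means the vertex is still in V
        inTree[0] = true;                  // T starts as the single vertex 0
        int total = 0;
        for (int i = 1; i <= n - 1; i++) {
            int bestWeight = Integer.MAX_VALUE;
            int bestVertex = -1;
            // Scan every edge that connects T to a vertex still in V.
            for (int t = 0; t < n; t++) {
                if (!inTree[t]) continue;
                for (int w = 0; w < n; w++) {
                    if (!inTree[w] && weights[t][w] != NO_EDGE
                            && weights[t][w] < bestWeight) {
                        bestWeight = weights[t][w];
                        bestVertex = w;
                    }
                }
            }
            inTree[bestVertex] = true;     // add w to T, delete it from V
            total += bestWeight;           // add e to T
        }
        return total;
    }

    // Made-up sample graph (not the graph from the slides).
    static int[][] sampleWeights() {
        return new int[][] {
            {0, 1, 3, 0},
            {1, 0, 2, 5},
            {3, 2, 0, 4},
            {0, 5, 4, 0}
        };
    }

    public static void main(String[] args) {
        System.out.println(primTotalWeight(sampleWeights()));
    }
}
```

Starting from vertex 0, the sketch adds vertices 1, 2, and 3 in that order, using the edges of weight 1, 2, and 4.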
Prim fights Kruskal • Apply Kruskal's algorithm to the graph below • Now, apply Prim's algorithm to the graph below • Is there any other MST we could make? [Figure: a small weighted graph with edge weights 3, 2, 1, 4, 1, 3, and 1]
Discontinuous functions • Recall the definition of the floor of x: • ⌊x⌋ = the largest integer that is less than or equal to x • Graph f(x) = ⌊x⌋ • Defining functions on integers instead of real values affects their graphs a great deal • Graph p_1(x) = x, x ∈ R • Graph f(n) = n, n ∈ N
Multiples of functions • There is a strong visual (and of course mathematical) correlation between a function and a multiple of that function • Examples: • g(x) = x + 2 • 2g(x) = 2x + 4 • Given f graphed below, sketch 2f [Figure: graph of f on axes running from -6 to 6 horizontally and from -2 to 2 vertically]
Absolute value • Consider the absolute value function • f(x) = |x| • Left of the origin it is constantly decreasing • Right of the origin it is constantly increasing [Figure: graph of f(x) = |x| on axes running from -6 to 6 horizontally and from -2 to 2 vertically]
Increasing and decreasing functions • We say that f is decreasing on the set S iff for all real numbers x1 and x2 in S, if x1 < x2, then f(x1) > f(x2) • We say that f is increasing on the set S iff for all real numbers x1 and x2 in S, if x1 < x2, then f(x1) < f(x2) • We say that f is an increasing (or decreasing) function iff f is increasing (or decreasing) on its entire domain • Clearly, a positive multiple of an increasing function is increasing • Virtually all running time functions are increasing functions
Asymptotic Notation (Big Oh) Student Lecture
Growth of functions • Mathematicians worry about the growth of various functions • They usually express such things in terms of limits, maybe with derivatives • We are focused primarily on functions that bound running time spent and memory consumed • We just need a rough guide • We want to know the order of the growth
Definitions • Let f and g be real-valued functions defined on the same set of nonnegative real numbers • f is of order at least g, written f(x) is Ω(g(x)), iff there is a positive A ∈ R and a nonnegative a ∈ R such that • A|g(x)| ≤ |f(x)| for all x > a • f is of order at most g, written f(x) is O(g(x)), iff there is a positive B ∈ R and a nonnegative b ∈ R such that • |f(x)| ≤ B|g(x)| for all x > b • f is of order g, written f(x) is Θ(g(x)), iff there are positive A, B ∈ R and a nonnegative k ∈ R such that • A|g(x)| ≤ |f(x)| ≤ B|g(x)| for all x > k
Using the notation • Express the following statements using appropriate notation: • 10|x^6| ≤ |17x^6 – 45x^3 + 2x + 8| ≤ 30|x^6|, for x > 2 • Justify the following: • is (x)
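The first statement has the shape A|g(x)| ≤ |f(x)| ≤ B|g(x)| for all x > k, with A = 10, B = 30, and k = 2, which is the definition of 17x^6 – 45x^3 + 2x + 8 being Θ(x^6). The sketch below spot-checks the inequality at integer sample points. It is a sanity check only, not a proof, and the names `ThetaCheck`, `boundsHold`, and `boundsHoldOnRange` are made up for this example.

```java
class ThetaCheck {
    // Checks 10|x^6| <= |17x^6 - 45x^3 + 2x + 8| <= 30|x^6| at one integer x.
    // Uses long arithmetic, which is safe for the small x values tried here.
    static boolean boundsHold(long x) {
        long x3 = x * x * x;
        long x6 = x3 * x3;
        long f = Math.abs(17 * x6 - 45 * x3 + 2 * x + 8);
        return 10 * x6 <= f && f <= 30 * x6;
    }

    // Checks every integer from 'from' to 'to', inclusive.
    static boolean boundsHoldOnRange(long from, long to) {
        for (long x = from; x <= to; x++) {
            if (!boundsHold(x)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(boundsHoldOnRange(3, 100));
    }
}
```

The check fails at x = 0, where the polynomial equals 8 but 30|x^6| is 0, which shows why the definition only requires the bounds beyond some threshold.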
Properties of Ω-, O-, and Θ-notation • Let f and g be real-valued functions defined on the same set of nonnegative real numbers • f(x) is Ω(g(x)) and f(x) is O(g(x)) iff f(x) is Θ(g(x)) • f(x) is Ω(g(x)) iff g(x) is O(f(x)) • f(x) is Ω(f(x)), f(x) is O(f(x)), and f(x) is Θ(f(x)) • If f(x) is O(g(x)) and g(x) is O(h(x)) then f(x) is O(h(x)) • If f(x) is O(g(x)) and c is a positive real, then cf(x) is O(g(x)) • If f(x) is O(h(x)) and g(x) is O(k(x)) then f(x) + g(x) is O(G(x)) where G(x) = max(|h(x)|,|k(x)|) for all x • If f(x) is O(h(x)) and g(x) is O(k(x)) then f(x)g(x) is O(h(x)k(x))
Orders of functions • If 1 < x, then • x < x^2 • x^2 < x^3 • … • So, for r, s ∈ R, where r < s and x > 1, • x^r < x^s • By extension, x^r is O(x^s)
Proving bounds • Prove a bound for g(x) = (1/4)(x – 1)(x + 1) for x ∈ R • Prove that x^2 is not O(x) • Hint: Proof by contradiction
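One possible argument for the second exercise, following the hint, is sketched below. The witness x_0 is a choice made for this sketch; other choices work equally well.

```latex
\begin{align*}
&\text{Suppose } x^2 \text{ is } O(x). \text{ Then there are } B > 0 \text{ and } b \ge 0
  \text{ with } |x^2| \le B|x| \text{ for all } x > b. \\
&\text{Let } x_0 = \max(B, b) + 1, \text{ so that } x_0 > b \text{ and } x_0 > B > 0. \\
&\text{Since } x_0 > b, \text{ we have } x_0^2 \le B x_0,
  \text{ and dividing by } x_0 > 0 \text{ gives } x_0 \le B. \\
&\text{This contradicts } x_0 > B, \text{ so } x^2 \text{ is not } O(x).
\end{align*}
```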
Polynomials • Let f(x) be a polynomial with degree n • f(x) = a_n x^n + a_(n-1) x^(n-1) + a_(n-2) x^(n-2) + … + a_1 x + a_0 • By extension from the previous results, if a_n is a positive real, then • f(x) is O(x^s) for all integers s ≥ n • f(x) is Ω(x^r) for all integers r ≤ n • f(x) is Θ(x^n) • Furthermore, let g(x) be a polynomial with degree m • g(x) = b_m x^m + b_(m-1) x^(m-1) + b_(m-2) x^(m-2) + … + b_1 x + b_0 • If a_n and b_m are positive reals, then • f(x)/g(x) is O(x^c) for all real numbers c > n – m • f(x)/g(x) is not O(x^c) for any real number c < n – m • f(x)/g(x) is Θ(x^(n – m))
Extending notation to algorithms • We can easily extend our Ω-, O-, and Θ-notations to analyzing the running time of algorithms • Imagine that an algorithm A is composed of some number of elementary operations (usually arithmetic, storing variables, etc.) • We can imagine that the running time is tied purely to the number of operations • This is, of course, a lie • Not all operations take the same amount of time • Even the same operation takes different amounts of time depending on caching, disk access, etc.
Running time • First, assume that the number of operations performed by A on input size n depends only on n, not on the values of the data; call this number f(n) • If f(n) is Θ(g(n)), we say that A is Θ(g(n)) or that A is of order g(n) • Now suppose the number of operations depends not only on n but also on the values of the data • Let b(n) be the minimum number of operations; if b(n) is Θ(g(n)), then we say that in the best case, A is Θ(g(n)) or that A has a best case order of g(n) • Let w(n) be the maximum number of operations; if w(n) is Θ(g(n)), then we say that in the worst case, A is Θ(g(n)) or that A has a worst case order of g(n)
Computing running time • With a single for (or other) loop, we simply count the number of operations that must be performed:
    int p = 0;
    int x = 2;
    for( int i = 2; i <= n; i++ )
        p = (p + i)*x;
• Counting multiplies and adds, (n – 1) iterations times 2 operations = 2n – 2 • As a polynomial, 2n – 2 is Θ(n)
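The count can be confirmed empirically by instrumenting the loop. In this sketch, the class name `SingleLoopCount` and the `ops` counter are additions for illustration; only the add and the multiply in the loop body are counted, matching the slide's convention.

```java
class SingleLoopCount {
    // Runs the loop from the slide and returns how many adds and
    // multiplies the loop body performs for input size n.
    static long countOperations(int n) {
        long ops = 0;
        int p = 0;
        int x = 2;
        for (int i = 2; i <= n; i++) {
            p = (p + i) * x; // one add and one multiply (overflow is irrelevant here)
            ops += 2;
        }
        return ops;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println(countOperations(n) + " operations, formula gives " + (2 * n - 2));
    }
}
```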
Nested loops • When loops do not depend on each other, we can simply multiply their iterations (and asymptotic bounds)
    int p = 0;
    for( int i = 2; i <= n; i++ )
        for( int j = 3; j <= n; j++ )
            p++;
• Clearly (n – 1)(n – 2) is Θ(n^2)
Trickier nested loops • When loops depend on each other, we have to do more analysis
    int s = 0;
    for( int i = 1; i <= n; i++ )
        for( int j = 1; j <= i; j++ )
            s = s + j*(i - j + 1);
• What's the running time here? • Arithmetic sequence saves the day (for the millionth time)
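As a hedged check of the arithmetic-sequence argument, the inner statement runs 1 + 2 + … + n = n(n + 1)/2 times, which is Θ(n^2). The sketch below counts the iterations directly; the class name `TriangularLoopCount` and both method names are made up for this example.

```java
class TriangularLoopCount {
    // Runs the dependent nested loops and counts inner-loop iterations.
    static long countIterations(int n) {
        long count = 0;
        long s = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= i; j++) {
                s = s + (long) j * (i - j + 1); // the statement from the slide
                count++;
            }
        }
        return count;
    }

    // Closed form from the arithmetic series 1 + 2 + ... + n.
    static long closedForm(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(100) + " vs " + closedForm(100));
    }
}
```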
Iterations with floor • When loops depend on floor, what happens to the running time?
    int a = 0;
    for( int i = n/2; i <= n; i++ )
        a = n - i;
• Floor is used implicitly here, because we are using integer division • What's the running time? Hint: Consider n as odd or as even separately
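Counting iterations directly suggests the answer: the loop runs n – ⌊n/2⌋ + 1 times, which is n/2 + 1 for even n and (n + 3)/2 for odd n, so it is Θ(n) either way. The sketch below assumes n is nonnegative, since Java integer division truncates toward zero and only matches floor in that case; the class name `HalfLoopCount` is an addition for illustration.

```java
class HalfLoopCount {
    // Runs the loop from the slide and counts its iterations (n >= 0 assumed).
    static int countIterations(int n) {
        int count = 0;
        int a = 0;
        for (int i = n / 2; i <= n; i++) { // n / 2 equals floor(n/2) for n >= 0
            a = n - i;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(10)); // even case
        System.out.println(countIterations(11)); // odd case
    }
}
```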
Sequential search • Consider a basic sequential search algorithm:
    int search( int[] array, int n, int value ) {
        for( int i = 0; i < n; i++ )
            if( array[i] == value )
                return i;
        return -1;
    }
• What's its best case running time? • What's its worst case running time? • What's its average case running time?
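To explore these questions, the search can be instrumented to report how many element comparisons it makes. In this sketch, `SearchCount` and `comparisons` are hypothetical names, and the sample array is made up. A single comparison suffices when the value is first, and all n are needed when it is last or absent; if the value is assumed equally likely to be at each of the n positions, the average is (n + 1)/2.

```java
class SearchCount {
    // Returns how many element comparisons sequential search performs
    // while looking for 'value' among the first n entries of 'array'.
    static int comparisons(int[] array, int n, int value) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            count++;                      // one comparison of array[i] with value
            if (array[i] == value) {
                return count;             // found: stop early
            }
        }
        return count;                     // not found: all n entries were checked
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(comparisons(data, data.length, 4));  // value is first
        System.out.println(comparisons(data, data.length, 7));  // value is absent
    }
}
```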
Insertion sort algorithm • Insertion sort is a common introductory sort • It is suboptimal, but it is one of the fastest ways to sort a small list (10 elements or fewer) • The idea is to sort initial segments of an array, inserting new elements in the right place as they are found • So, for each new item, keep moving it up until the element above it is too small (or we hit the top)
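Insertion sort in code
    public static void sort( int[] array, int n ) {
        for( int i = 1; i < n; i++ ) {
            int next = array[i];      // element to insert into the sorted prefix
            int j = i - 1;
            // Shift larger elements right; with zero-based arrays, index 0 must be checked too
            while( j >= 0 && array[j] > next ) {
                array[j+1] = array[j];
                j--;
            }
            array[j+1] = next;        // the slot just after the last smaller-or-equal element
        }
    }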
Best case analysis of insertion sort • What is the best case analysis of insertion sort? • Hint: Imagine the array is already sorted
Worst case analysis of insertion sort • What is the worst case analysis of insertion sort? • Hint: Imagine the array is sorted in reverse order
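Both hints can be tried out by counting comparisons. The sketch below is an instrumented insertion sort written for zero-based arrays (loop bound `j >= 0`, insertion at `array[j + 1]`); the class name `InsertionSortCount` and the counter are additions for illustration, and only comparisons between array elements are counted. On an already sorted array of n elements it makes n – 1 comparisons, which is Θ(n), and on a reversed array it makes 1 + 2 + … + (n – 1) = n(n – 1)/2, which is Θ(n^2).

```java
class InsertionSortCount {
    // Sorts the first n entries of 'array' in place and returns the
    // number of comparisons made between array elements.
    static long sortAndCount(int[] array, int n) {
        long comparisons = 0;
        for (int i = 1; i < n; i++) {
            int next = array[i];
            int j = i - 1;
            while (j >= 0) {
                comparisons++;               // compare array[j] with next
                if (array[j] <= next) break; // found the insertion point
                array[j + 1] = array[j];     // shift the larger element right
                j--;
            }
            array[j + 1] = next;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 3, 4, 5, 6};
        int[] reversed = {6, 5, 4, 3, 2, 1};
        System.out.println(sortAndCount(sorted, sorted.length));     // best case
        System.out.println(sortAndCount(reversed, reversed.length)); // worst case
    }
}
```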
Average case analysis of insertion sort • What is the average case analysis of insertion sort? • Much harder than the previous two! • Let's look at it recursively • Let E_k be the average number of comparisons needed to sort k elements • E_k can be computed as the sum of the average number of comparisons needed to sort k – 1 elements plus the average number of comparisons (x) needed to insert the kth element in the right place • E_k = E_(k-1) + x
Finding x • We can employ the idea of expected value from probability • There are k possible locations for the element to go • We assume that any of these k locations is equally likely • For each turn of the loop, there are 2 comparisons to do • There could be 1, 2, 3, … up to k turns of the loop • Thus, weighting each possible number of iterations evenly gives us • x = (1/k)(2·1 + 2·2 + … + 2·k) = (2/k)(k(k + 1)/2) = k + 1
Finishing the analysis • Having found x, our recurrence relation is: • E_k = E_(k-1) + k + 1 • Sorting one element takes no time, so E_1 = 0 • Solve this recurrence relation! • Well, if you really banged away at it, you might find: • E_n = (1/2)(n^2 + 3n – 4) • By the polynomial rules, this is Θ(n^2) and so the average case running time is the same as the worst case
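One way to reach that closed form, sketched here as a check rather than as the lecture's own derivation, is to unroll the recurrence down to the base case and sum the resulting series:

```latex
\begin{align*}
E_n &= E_1 + \sum_{k=2}^{n} (k + 1) \\
    &= 0 + \left( \frac{n(n+1)}{2} - 1 \right) + (n - 1) \\
    &= \frac{n^2 + 3n - 4}{2}
\end{align*}
```

As a quick check, n = 1 gives E_1 = 0, and n = 2 gives E_2 = 3, which matches E_1 + 2 + 1.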
Next time… • Finish algorithmic efficiency • Exponential and logarithmic functions
Reminders • Keep reading Chapter 11 • Keep working on Assignment 9 • Due Friday before midnight • Exam 3 is next Monday • Review on Friday