G64ADS Advanced Data Structures

Study advanced data structures, algorithm design, analysis, and implementation techniques. Gain practical skills in efficient algorithm implementation on modern computers.


Presentation Transcript


  1. G64ADS Advanced Data Structures • Guoping Qiu, Room C34, CS Building

  2. About this course • Study advanced data structures, algorithm design, analysis and implementation techniques • To obtain advanced knowledge and practical skills in the efficient implementation of algorithms on modern computers.

  3. About this course • Pre-requisites • Good programming experience (this course does not teach programming) • Java, C++, or C

  4. About this course • Timetable • Lecture: Friday, 10:00 - 12:00, C60 • Labs: Tuesday, 9:00 - 11:00, B52

  5. About this course • Assessments • Final Exam 50% • Continuous assessment 50% • Homework assignments • Programming projects

  6. About this course • References and learning materials • Textbooks • Mark Allen Weiss, Data Structures and Algorithm Analysis in Java, 2nd Edition, Addison Wesley, 2007 • Mark Allen Weiss, Data Structures and Problem Solving Using Java, 3rd Edition, Addison Wesley, 2006 • Robert Sedgewick, Algorithms in C, 3rd Edition, Addison Wesley, 1998 • Robert Sedgewick, Algorithms in C++, 3rd Edition, Addison Wesley, 1998 • Robert Sedgewick, Algorithms in Java, 3rd Edition, Addison Wesley, 2003 • Course web page: http://www.cs.nott.ac.uk/~qiu/Teaching/G64ADS (slides, coursework, other materials)

  7. Course Overview • Advanced data structures • Trees, hash tables, heaps, disjoint sets, graphs • Algorithm development and analysis • Insert, delete, search, sort • Applications • Implementation in Java, C++, C

  8. Advanced Data Structures • “Why not just use a big array?” • Example problem: search for a number k in a set of N numbers • Solution: store the numbers in an array of size N and iterate through the array until k is found • Number of checks: best case 1 (k = 29), worst case N (k = 25), average case N/2 • Array contents: 29 20 65 80 21 39 25

  9. Advanced Data Structures • Solution #2 • Store the numbers in a binary search tree • Search the tree until k is found • Number of checks (assuming a balanced tree): best case 1 (k = 29), worst case log_2 N (k = 25), average case (log_2 N) / 2 • Tree nodes: 29 20 65 21 80 25 39
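The two solutions above can be compared by counting checks directly. The following is a minimal sketch, not the course's code: the class `SearchChecks`, the `Node` type, and the helper names are hypothetical, and the tree is built by inserting the numbers in array order, which is an assumption about how the slide's tree was formed.

```java
// Sketch: count the checks made by a linear scan and by a binary search tree lookup.
final class SearchChecks {
    /** Number of array elements examined before k is found (a.length if k is absent). */
    static int linearChecks(int[] a, int k) {
        int checks = 0;
        for (int value : a) {
            checks++;
            if (value == k) break;
        }
        return checks;
    }

    /** Minimal unbalanced BST node (hypothetical helper type). */
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Standard BST insertion without rebalancing; duplicates are ignored. */
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    /** Builds a tree by inserting the keys in the order given. */
    static Node build(int[] keys) {
        Node root = null;
        for (int key : keys) root = insert(root, key);
        return root;
    }

    /** Number of nodes examined before k is found or the search leaves the tree. */
    static int treeChecks(Node root, int k) {
        int checks = 0;
        Node cur = root;
        while (cur != null) {
            checks++;
            if (k == cur.key) break;
            cur = (k < cur.key) ? cur.left : cur.right;
        }
        return checks;
    }

    public static void main(String[] args) {
        int[] data = {29, 20, 65, 80, 21, 39, 25};
        Node root = build(data);
        System.out.println("linear, k=25: " + linearChecks(data, 25));
        System.out.println("tree,   k=25: " + treeChecks(root, 25));
    }
}
```

With this insertion order the tree is not perfectly balanced, so finding k = 25 takes 4 checks rather than 3; the log_2 N worst case holds only when the tree is kept balanced.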

  10. Analysis • Does it matter? N vs. log_2 N

  11. Analysis • Does it matter? Assume: • N = 1,000,000,000 (1 billion; Walmart transactions in 100 days) • 1 GHz processor = 10^9 cycles per second • Solution #1 (10 cycles per check): worst case 1 billion checks = 10 seconds • Solution #2 (100 cycles per check): worst case 30 checks = 0.000003 seconds
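The two worst-case times follow from the stated assumptions; the arithmetic can be written out as below, where the 30 checks come from rounding log_2 of one billion (about 29.9) up to a whole number.

```latex
\begin{align*}
T_1 &= \frac{10^{9}\ \text{checks} \times 10\ \text{cycles/check}}{10^{9}\ \text{cycles/s}} = 10\ \text{s},\\
T_2 &= \frac{\lceil \log_2 10^{9} \rceil\ \text{checks} \times 100\ \text{cycles/check}}{10^{9}\ \text{cycles/s}}
     = \frac{30 \times 100}{10^{9}}\ \text{s} = 3 \times 10^{-6}\ \text{s}.
\end{align*}
```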

  12. Advanced Data Structures • Does it matter? • The Message • Appropriate data structures ease design and improve performance • The Challenge • Design appropriate data structure and associated algorithms for a problem • Analyze to show improved performance

  13. Algorithm Analysis

  14. Purpose • Why bother analyzing code; isn’t getting it to work enough? • Estimate time and memory in the average case and worst case • Identify bottlenecks, i.e., where to reduce time • Speed up critical algorithms

  15. Algorithm • Problem • Specifies the desired input-output relationship • Algorithm • Well-defined computational procedure for transforming inputs to outputs • Correct algorithm • Produces the correct output for every possible input in finite time • Solves the problem

  16. Algorithm Analysis • Predict resource utilization of an algorithm • Running time • Memory • Dependent on architecture • Serial • Parallel • Quantum

  17. What to Analyze • Main focus is on running time • Memory/time tradeoff • Simple serial computing model • Single processor, infinite memory

  18. What to Analyze • Running time T(N) • N is typically the size of the input • Sorting? • Multiplying two integers? • Multiplying two matrices? • Traversing a graph? • T(N) measures number of primitive operations performed e.g., addition, multiplication, comparison, assignment

  19. Example

  20. Example • General Rules • Rule 1 – for loop • The running time of a for loop is at most the running time of the statements inside the for loop times the number of iterations

  21. Example • General Rules • Rule 2 – nested loops

      for (i = 0; i < n; i++)
          for (j = 0; j < n; j++)
              k++;

      This fragment is O(N^2).
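The quadratic bound for the nested loop can be checked empirically by counting how often the innermost statement runs. This is a small sketch; the class and method names are hypothetical.

```java
// Sketch: count executions of the body of a doubly nested loop.
final class LoopCount {
    /** Runs the nested loop from Rule 2 and returns how many times k++ executed. */
    static long nestedLoopCount(int n) {
        long k = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                k++;
        return k;
    }

    public static void main(String[] args) {
        // Multiplying n by 10 multiplies the count by 100, as expected for N^2 growth.
        for (int n = 10; n <= 1000; n *= 10)
            System.out.println("n=" + n + "  count=" + nestedLoopCount(n));
    }
}
```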

  22. Example • General Rules • Rule 3 – consecutive statements

      for (i = 0; i < n; i++)
          a[i] = 0;
      for (i = 0; i < n; i++)
          for (j = 0; j < n; j++)
              a[i] += a[j] + i + j;

      This fragment is O(N) work followed by O(N^2) work, so it is O(N^2) overall.

  23. Example • General Rules • Rule 4 – if/else

      if (condition)
          S1
      else
          S2

      The running time is no more than the running time of the test plus max(S1, S2).

  24. What to Analyze • Worst-case running time T_worst(N) • Average-case running time T_avg(N) • T_avg(N) ≤ T_worst(N) • Typically analyze worst-case behavior • Average case hard to compute • Worst case gives a guaranteed upper bound

  25. Rate of Growth • Exact expressions for T(N) are meaningless and hard to compare • Rate of growth • Asymptotic behavior of T(N) as N gets big • Usually expressed as the fastest-growing term in T(N), dropping constant coefficients, e.g., T(N) = 3N^2 + N + 1 → Θ(N^2)

  26. Rate of Growth • T(N) = O(f(N)) if there are positive constants c and n_0 such that T(N) ≤ c·f(N) when N ≥ n_0 • Asymptotic upper bound • “Big-Oh” notation • T(N) = Ω(g(N)) if there are positive constants c and n_0 such that T(N) ≥ c·g(N) when N ≥ n_0 • Asymptotic lower bound • “Big-Omega” notation
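As a worked instance of these definitions, take the polynomial T(N) = 3N^2 + N + 1. The constants chosen below are one valid choice among many, picked only for illustration.

```latex
\begin{align*}
3N^2 + N + 1 &\le 3N^2 + N^2 + N^2 = 5N^2 && \text{for } N \ge 1 \quad (c = 5,\ n_0 = 1),\\
3N^2 + N + 1 &\ge 3N^2 && \text{for } N \ge 1 \quad (c = 3,\ n_0 = 1).
\end{align*}
```

The first line shows T(N) = O(N^2) and the second shows T(N) = Ω(N^2).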

  27. Rate of Growth • T(N) = Θ(h(N)) if and only if T(N) = O(h(N)) and T(N) = Ω(h(N)) • Asymptotic tight bound • T(N) = o(p(N)) if for all positive constants c there exists an n_0 such that T(N) < c·p(N) when N > n_0 • i.e., T(N) = o(p(N)) if T(N) = O(p(N)) and T(N) ≠ Θ(p(N)) • “Little-oh” notation

  28. Rate of Growth • N^2 = O(N^2) = O(N^3) = O(2^N) • N^2 = Ω(1) = Ω(N) = Ω(N^2) • N^2 = Θ(N^2) • N^2 = o(N^3) • 2N^2 + 1 = Θ(?) • N^2 + N = Θ(?)

  29. Rate of Growth • Rule 1: If T_1(N) = O(f(N)) and T_2(N) = O(g(N)), then • T_1(N) + T_2(N) = O(f(N) + g(N)) • T_1(N) * T_2(N) = O(f(N) * g(N)) • Rule 2: If T(N) is a polynomial of degree k, then T(N) = Θ(N^k) • Rule 3: log^k N = O(N) for any constant k

  30. Rate of Growth

  31. Example • Maximum Subsequence Sum problem • Given (possibly negative) integers A_1, A_2, …, A_N, find the maximum value of Σ_{k=i}^{j} A_k over all 1 ≤ i ≤ j ≤ N (taken to be 0 if all the integers are negative) • e.g., for input -2, 11, -4, 13, -5, -2, the answer is 20 (A_2 through A_4)

  33. Example • Algorithm 1 • Compute each possible subsequence independently

      MaxSubSum1 (A)
          maxSum = 0
          for i = 1 to N
              for j = i to N
                  sum = 0
                  for k = i to j
                      sum = sum + A[k]
                  if (sum > maxSum)
                      then maxSum = sum
          return maxSum

  34. Example • Algorithm 1

      /**
       * Cubic maximum contiguous subsequence sum algorithm.
       */
      public static int maxSubSum1( int [ ] a )
      {
          int maxSum = 0;

          for( int i = 0; i < a.length; i++ )
              for( int j = i; j < a.length; j++ )
              {
                  int thisSum = 0;

                  for( int k = i; k <= j; k++ )
                      thisSum += a[ k ];

                  if( thisSum > maxSum )
                      maxSum = thisSum;
              }

          return maxSum;
      }

  35. Example • Algorithm 1: Analysis
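One way to carry out this analysis, following the loop structure of maxSubSum1 with 0-based indices, is to count how many times the innermost statement `thisSum += a[ k ]` executes:

```latex
\begin{align*}
\sum_{i=0}^{N-1} \sum_{j=i}^{N-1} \sum_{k=i}^{j} 1
  &= \sum_{i=0}^{N-1} \sum_{j=i}^{N-1} (j - i + 1)
   = \sum_{i=0}^{N-1} \frac{(N-i)(N-i+1)}{2} \\
  &= \sum_{m=1}^{N} \frac{m(m+1)}{2}
   = \frac{N(N+1)(N+2)}{6}
   = \frac{N^3 + 3N^2 + 2N}{6}
   = \Theta(N^3).
\end{align*}
```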

  36. Example • Algorithm 2 • Note that Σ_{k=i}^{j} A_k = A_j + Σ_{k=i}^{j-1} A_k • No reason to re-compute the sum each time

      MaxSubSum2 (A)
          maxSum = 0
          for i = 1 to N
              sum = 0
              for j = i to N
                  sum = sum + A[j]
                  if (sum > maxSum)
                      then maxSum = sum
          return maxSum

  37. Example • Algorithm 2

      /**
       * Quadratic maximum contiguous subsequence sum algorithm.
       */
      public static int maxSubSum2( int [ ] a )
      {
          int maxSum = 0;

          for( int i = 0; i < a.length; i++ )
          {
              int thisSum = 0;

              for( int j = i; j < a.length; j++ )
              {
                  thisSum += a[ j ];

                  if( thisSum > maxSum )
                      maxSum = thisSum;
              }
          }

          return maxSum;
      }

  38. Example • Algorithm 2: Analysis
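A possible form of this analysis counts executions of the inner-loop body of maxSubSum2, again with 0-based indices:

```latex
\begin{align*}
\sum_{i=0}^{N-1} \sum_{j=i}^{N-1} 1
  = \sum_{i=0}^{N-1} (N - i)
  = \sum_{m=1}^{N} m
  = \frac{N(N+1)}{2}
  = \Theta(N^2).
\end{align*}
```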

  39. Example • Algorithm 3 • Recursive, divide and conquer • Divide sequence in half • A(1 ... center) and A(center+1 ... N) • Recursively compute MaxSubSum of left half • Recursively compute MaxSubSum of right half • Compute MaxSubSum of sequence constrained to use A(center) and A(center+1) • e.g., <4, -3, 5, -2, -1, 2, 6, -2>

  40. Example • Algorithm 3

      MaxSubSum3 (A, i, j)
          maxSum = 0
          if (i = j)
              then if A[i] > 0
                  then maxSum = A[i]
          else
              k = floor((i+j)/2)
              maxSumLeft = MaxSubSum3(A, i, k)
              maxSumRight = MaxSubSum3(A, k+1, j)
              // compute maxSumThruCenter
              maxSum = maximum(maxSumLeft, maxSumRight, maxSumThruCenter)
          return maxSum

  41. Example • Algorithm 3

      /**
       * Recursive maximum contiguous subsequence sum algorithm.
       * Finds maximum sum in subarray spanning a[left..right].
       * Does not attempt to maintain actual best sequence.
       */
      private static int maxSumRec( int [ ] a, int left, int right )
      {
          if( left == right )  // Base case
              if( a[ left ] > 0 )
                  return a[ left ];
              else
                  return 0;

  42. Example

          int center = ( left + right ) / 2;
          int maxLeftSum  = maxSumRec( a, left, center );
          int maxRightSum = maxSumRec( a, center + 1, right );

          int maxLeftBorderSum = 0, leftBorderSum = 0;
          for( int i = center; i >= left; i-- )
          {
              leftBorderSum += a[ i ];
              if( leftBorderSum > maxLeftBorderSum )
                  maxLeftBorderSum = leftBorderSum;
          }

          int maxRightBorderSum = 0, rightBorderSum = 0;
          for( int i = center + 1; i <= right; i++ )
          {
              rightBorderSum += a[ i ];
              if( rightBorderSum > maxRightBorderSum )
                  maxRightBorderSum = rightBorderSum;
          }

          return max3( maxLeftSum, maxRightSum,
                       maxLeftBorderSum + maxRightBorderSum );
      }
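The recursive routine above calls a `max3` helper and needs a public entry point, neither of which appears on the slides. The following self-contained sketch fills those in under stated assumptions: the class name `MaxSubSum3Demo`, the body of `max3`, and the driver `maxSubSum3` (including its choice to return 0 for an empty array) are illustrative guesses, not the course's code.

```java
// Sketch: divide-and-conquer maximum contiguous subsequence sum with assumed helpers.
final class MaxSubSum3Demo {
    /** Largest of three ints (assumed implementation of the helper the slides call). */
    static int max3(int a, int b, int c) {
        return Math.max(a, Math.max(b, c));
    }

    /** Maximum sum of a contiguous run inside a[left..right]; 0 if every run is negative. */
    static int maxSumRec(int[] a, int left, int right) {
        if (left == right)                        // base case: a single element
            return Math.max(a[left], 0);

        int center = left + (right - left) / 2;   // same split as (left + right) / 2, without overflow
        int maxLeftSum = maxSumRec(a, left, center);
        int maxRightSum = maxSumRec(a, center + 1, right);

        // Best sum that ends exactly at a[center], scanning leftwards.
        int maxLeftBorderSum = 0, leftBorderSum = 0;
        for (int i = center; i >= left; i--) {
            leftBorderSum += a[i];
            maxLeftBorderSum = Math.max(maxLeftBorderSum, leftBorderSum);
        }

        // Best sum that starts exactly at a[center + 1], scanning rightwards.
        int maxRightBorderSum = 0, rightBorderSum = 0;
        for (int i = center + 1; i <= right; i++) {
            rightBorderSum += a[i];
            maxRightBorderSum = Math.max(maxRightBorderSum, rightBorderSum);
        }

        return max3(maxLeftSum, maxRightSum, maxLeftBorderSum + maxRightBorderSum);
    }

    /** Assumed driver: an empty array is treated as having maximum sum 0. */
    static int maxSubSum3(int[] a) {
        return a.length == 0 ? 0 : maxSumRec(a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] example = {4, -3, 5, -2, -1, 2, 6, -2};
        System.out.println(maxSubSum3(example));
    }
}
```

For the sequence <4, -3, 5, -2, -1, 2, 6, -2> the left half contributes 6, the right half 8, and the run through the center 4 + 7 = 11, so the result is 11.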

  43. Example • Algorithm 3: Analysis
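A sketch of how this analysis can go, assuming for simplicity that N is a power of 2 and that the two border loops together cost N steps: each call makes two recursive calls on halves and does linear extra work.

```latex
\begin{align*}
T(1) &= 1, \qquad T(N) = 2\,T(N/2) + N, \\
\frac{T(N)}{N} &= \frac{T(N/2)}{N/2} + 1
  = \frac{T(N/4)}{N/4} + 2
  = \cdots
  = \frac{T(1)}{1} + \log_2 N, \\
T(N) &= N \log_2 N + N = O(N \log N).
\end{align*}
```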

  44. Example • Algorithm 4 • Observation • Any negative subsequence cannot be a prefix to the maximum sequence • Or, a positive, contiguous subsequence is always worth adding • T(N) = ?

      MaxSubSum4 (A)
          maxSum = 0
          sum = 0
          for j = 1 to N
              sum = sum + A[j]
              if (sum > maxSum)
                  then maxSum = sum
              else if (sum < 0)
                  then sum = 0
          return maxSum

  45. Example • Algorithm 4

      /**
       * Linear-time maximum contiguous subsequence sum algorithm.
       */
      public static int maxSubSum4( int [ ] a )
      {
          int maxSum = 0, thisSum = 0;

          for( int j = 0; j < a.length; j++ )
          {
              thisSum += a[ j ];

              if( thisSum > maxSum )
                  maxSum = thisSum;
              else if( thisSum < 0 )
                  thisSum = 0;
          }

          return maxSum;
      }

  46. MaxSubSum Running Times

  47. MaxSubSum Running Times

  48. MaxSubSum Running Times
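Running times of this kind can be measured directly. The sketch below times the quadratic and linear algorithms on pseudo-random input; the class name, the `randomInput` helper, the seed, and the input sizes are all assumptions chosen for illustration, and single-shot `System.nanoTime` measurements are only rough because of JIT warm-up.

```java
// Sketch: rough timing of the quadratic (Algorithm 2) and linear (Algorithm 4) solutions.
final class MaxSubSumTiming {
    /** Quadratic algorithm: try every start index and extend the sum to the right. */
    static int maxSubSum2(int[] a) {
        int maxSum = 0;
        for (int i = 0; i < a.length; i++) {
            int thisSum = 0;
            for (int j = i; j < a.length; j++) {
                thisSum += a[j];
                if (thisSum > maxSum) maxSum = thisSum;
            }
        }
        return maxSum;
    }

    /** Linear algorithm: drop the running sum as soon as it becomes negative. */
    static int maxSubSum4(int[] a) {
        int maxSum = 0, thisSum = 0;
        for (int value : a) {
            thisSum += value;
            if (thisSum > maxSum) maxSum = thisSum;
            else if (thisSum < 0) thisSum = 0;
        }
        return maxSum;
    }

    /** Hypothetical helper: n pseudo-random values in [-100, 100] from a fixed seed. */
    static int[] randomInput(int n, long seed) {
        java.util.Random rng = new java.util.Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = rng.nextInt(201) - 100;
        return a;
    }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 16_000; n *= 2) {
            int[] a = randomInput(n, 42L);
            long t0 = System.nanoTime();
            int r2 = maxSubSum2(a);
            long t1 = System.nanoTime();
            int r4 = maxSubSum4(a);
            long t2 = System.nanoTime();
            System.out.printf("N=%d  quadratic=%d us  linear=%d us  same=%b%n",
                    n, (t1 - t0) / 1_000, (t2 - t1) / 1_000, r2 == r4);
        }
    }
}
```

Doubling N should roughly quadruple the quadratic time and double the linear time, although small inputs are dominated by measurement noise.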

  49. Logarithmic Behaviour • T(N) = O(log_2 N) usually occurs when • Problem can be halved in constant time • Solutions to sub-problems combined in constant time • Examples • Binary search • Euclid’s algorithm • Exponentiation
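Two of the listed examples can be sketched briefly. The code below is an illustrative version, not taken from the slides: the class name `LogExamples` is hypothetical, `gcd` assumes non-negative arguments, and `pow` assumes n ≥ 0 and does not check for overflow.

```java
// Sketch: two routines whose number of steps grows logarithmically.
final class LogExamples {
    /** Euclid's algorithm: greatest common divisor of two non-negative values. */
    static long gcd(long m, long n) {
        while (n != 0) {
            long rem = m % n;   // the remainder shrinks quickly, giving O(log N) iterations
            m = n;
            n = rem;
        }
        return m;
    }

    /** Exponentiation by repeated squaring: x^n for n >= 0, using O(log n) multiplications. */
    static long pow(long x, int n) {
        if (n == 0) return 1;
        long half = pow(x * x, n / 2);          // (x^2)^(n/2)
        return (n % 2 == 0) ? half : half * x;  // one extra factor of x when n is odd
    }

    public static void main(String[] args) {
        System.out.println(gcd(1989, 1590));
        System.out.println(pow(2, 10));
    }
}
```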

  50. Binary Search • Given an integer X and integers A_0, A_1, …, A_{N-1}, which are presorted and already in memory, find i such that A_i = X, or return i = -1 if X is not in the input • T(N) = O(log_2 N) • T(N) = Θ(log_2 N) ?
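The problem statement above maps onto a short iterative routine. This is a minimal sketch; the class name `BinarySearchDemo` and the constant `NOT_FOUND` are illustrative, and the array is assumed to be sorted in ascending order.

```java
// Sketch: iterative binary search over a sorted int array.
final class BinarySearchDemo {
    static final int NOT_FOUND = -1;

    /** Returns an index i with a[i] == x, or NOT_FOUND if x does not occur in a. */
    static int binarySearch(int[] a, int x) {
        int low = 0, high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // midpoint without int overflow
            if (a[mid] < x)
                low = mid + 1;                  // x can only be in the upper half
            else if (a[mid] > x)
                high = mid - 1;                 // x can only be in the lower half
            else
                return mid;
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        int[] sorted = {20, 21, 25, 29, 39, 65, 80};
        System.out.println(binarySearch(sorted, 25));
        System.out.println(binarySearch(sorted, 30));
    }
}
```

Each iteration halves the remaining range `high - low`, so the loop runs at most about log_2 N + 1 times.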
