Lecture 7

Lecture 7 CS203

Midterm Exam The midterm exam will be next Wednesday at the regular class time and will be timed to fill the entire class period. On Monday, I will administer a practice midterm. The number, format, and difficulty of the questions and the coverage of course topics will be comparable to the actual midterm. The practice exam will not be graded, but I ask you to turn it in anyway so that I can spot-check your answers to help me prepare the actual exam. I have not yet written the practice or actual exam, but each exam will have a large number of short-answer questions, a one-paragraph writing question, and a programming question. One of the short-answer questions from the practice exam will be repeated on the actual exam.

IMPORTANT NOTICE You will need to use a lab computer on the midterm exam. Make sure *today in lab* that you can log on to the network, and familiarize yourself with the version of Eclipse that is installed on the lab computers. My past policy has been to make the programming question on my exams relatively easy. The programming question on this midterm will be more difficult than usual. Make sure you understand all the material in the labs through lab 5, which will be assigned tonight.

More on ArrayLists and Linked Lists
package listcompare;

public class StopWatch {
    private long elapsedTime;   // total of all completed timing intervals, in ms
    private long startTime;     // when the current interval began
    private boolean isRunning;

    public StopWatch() {
        reset();
    }

    public void start() {
        if (isRunning) {
            return;
        }
        isRunning = true;
        startTime = System.currentTimeMillis();
    }

    public void stop() {
        if (!isRunning) {
            return;
        }
        isRunning = false;
        long endTime = System.currentTimeMillis();
        elapsedTime = elapsedTime + endTime - startTime;
    }

    public long getElapsedTime() {
        if (isRunning) {
            // include the interval that is still in progress
            long endTime = System.currentTimeMillis();
            return elapsedTime + endTime - startTime;
        } else {
            return elapsedTime;
        }
    }

    public void reset() {
        elapsedTime = 0;
        isRunning = false;
    }
}

More on ArrayLists and Linked Lists
package listcompare;

import java.util.List;

public class ListComparer {

    // inserts count values, each one at the front of the list (index 0)
    public void integerListSequentialAdds(List<Integer> list, int count) {
        for (int counter = 0; counter < count; counter++) {
            list.add(0, counter);
        }
    }

    // reads the element at each of the given indices
    public void integerListGets(List<Integer> list, List<Integer> indices) {
        for (int i : indices) {
            list.get(i);
        }
    }
}

More on ArrayLists and Linked Lists
package listcompare;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

public class ListDriver {
    public static void main(String[] args) {
        Random r = new Random();
        StopWatch s = new StopWatch();
        ListComparer l = new ListComparer();
        List<Integer> al = new ArrayList<Integer>();
        List<Integer> ll = new LinkedList<Integer>();
        int count = 100000;
        int indexCount = 100000;

        // random indices to look up, all within the bounds of the lists
        List<Integer> indices = new ArrayList<Integer>(indexCount);
        for (int counter = 0; counter < indexCount; counter++)
            indices.add(r.nextInt(count));

        // fill both lists with the same number of elements
        l.integerListSequentialAdds(ll, count);
        l.integerListSequentialAdds(al, count);

        // time the gets on the linked list
        s.reset();
        s.start();
        l.integerListGets(ll, indices);
        s.stop();
        System.out.println("Linked List exercise took " + s.getElapsedTime() + " ms");

        // time the same gets on the array list
        s.reset();
        s.start();
        l.integerListGets(al, indices);
        s.stop();
        System.out.println("Array List exercise took " + s.getElapsedTime() + " ms");
    }
}

Math Ahead! • The rest of this lecture uses a few math principles that you learned in HS but may have forgotten. • Do not worry (too much) if your math background is shaky. We introduce mathematical material at a gentle pace. When I started here, I had not taken a math class in 25 years, and I managed to learn this material. You can too. • On the other hand, if you want to study this material in more detail, you will not be disappointed. You just have to wait until you take CS312.

Summations Summation is the operation of adding a sequence of numbers; the result is their sum or total. Summation is designated with the Greek letter sigma (∑). For example, in $\sum_{i=1}^{100} i$ we start from i = 1, iterate through the values of i, stop at 100, and find the sum. This summation means "the sum of all integers between 1 and 100 inclusive."

Useful Mathematical Summations

Useful Mathematical Summations The first summation on the previous slide would usually be expressed this way: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$. For n = 100, the value of the summation is 5050, since 100(101)/2 = 10100/2 = 5050: $\sum_{i=1}^{100} i = 5050$.
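As a quick check of the closed form, here is a minimal Java sketch that computes the same sum two ways. The class and method names are made up for this illustration; they are not part of the course code.

```java
// Sketch: compare a loop-based sum with the closed form n(n+1)/2.
final class SummationCheck {

    // Adds 1 + 2 + ... + n one term at a time (n additions).
    static long sumByLoop(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Closed form: the same number of operations no matter how large n is.
    static long sumByFormula(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(100));    // prints 5050
        System.out.println(sumByFormula(100)); // prints 5050
    }
}
```

The loop does work proportional to n, while the formula does a fixed amount of work. That difference is the kind of thing the rest of this lecture measures.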

Logarithms • The logarithm of a number is the exponent to which another number, the base, must be raised to yield the number. • Here is the notation: log_b(y) = x, where b is the base; equivalently, b^x = y. The parentheses are usually left out in practice. • Examples: • log_2 8 = 3 • log_10 10,000 = 4 • The word "logarithm" was coined (by an early-modern mathematician) from Greek and means roughly "number reasoning." Interestingly (to me, anyway) it is completely unrelated to its anagram "algorithm," which is derived from a Latin version of an Arabic version of the name of the medieval Persian mathematician al-Khwarizmi.

Logarithms • In other fields, log without a stated base is understood to refer to log_10 or log_e. • In CS, log without further qualification is understood to refer to log_2, pronounced "log base 2" or "binary logarithm." The base is not important in comparing algorithms, but, as you will see, the base is almost always 2 when we are calculating the complexity of programming algorithms.
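Java's Math class offers natural and base-10 logarithms but no base-2 method, so here is a small sketch of two ways to get one. The class and method names are hypothetical. The first method uses the change-of-base identity log_b(y) = ln(y) / ln(b); the second counts how many times n can be halved, which is how log_2 usually shows up in algorithm analysis.

```java
// Sketch: two ways to compute logarithms in Java (names are illustrative only).
final class LogSketch {

    // Change of base: log_b(y) = ln(y) / ln(b). The result is a double,
    // so it may differ from the exact value by a tiny rounding error.
    static double log(double base, double y) {
        return Math.log(y) / Math.log(base);
    }

    // floor(log_2(n)) for n >= 1: the number of times n can be halved
    // (integer division) before it reaches 1.
    static int floorLog2(int n) {
        int halvings = 0;
        while (n > 1) {
            n /= 2;
            halvings++;
        }
        return halvings;
    }

    public static void main(String[] args) {
        System.out.println(log(2, 8));        // approximately 3
        System.out.println(log(10, 10000));   // approximately 4
        System.out.println(floorLog2(8));     // prints 3
    }
}
```

Because changing the base only divides by the constant ln(b), two logarithms with different bases always differ by a constant factor, which is why the base does not matter when comparing growth rates.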

Recurrence Relations • A recurrence relation is a rule by which a sequence is generated. • E.g., the sequence 5, 8, 11, 14, 17, 20, … is described by the recurrence relation a_0 = 5, a_n = a_{n-1} + 3. • Divide-and-conquer algorithms are often described in terms of recurrence relations.
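A recurrence relation translates almost word for word into a recursive method. The following sketch (the class name is made up for this example) generates the sequence above directly from a_0 = 5 and a_n = a_{n-1} + 3.

```java
// Sketch: the recurrence a_0 = 5, a_n = a_{n-1} + 3 as a recursive method.
final class RecurrenceSketch {

    static int a(int n) {
        if (n == 0) {
            return 5;            // base case: a_0 = 5
        }
        return a(n - 1) + 3;     // recursive case: a_n = a_{n-1} + 3
    }

    public static void main(String[] args) {
        // prints the terms 5, 8, 11, 14, 17, 20, one per line
        for (int n = 0; n <= 5; n++) {
            System.out.println(a(n));
        }
    }
}
```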

Analyzing Binary Search • Binary search searches an array or list that is *sorted*. • In each step, the algorithm compares the search key value with the key value of the middle element of the array. If the keys match, then a matching element has been found and its index, or position, is returned. • Otherwise, if the search key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the search key is greater, on the sub-array to the right. • If the sub-array remaining to be searched is empty at any step, then the key cannot be found in the array and a special "not found" indication is returned.
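The steps above can be sketched in Java as follows. This is one possible iterative version for an int array, with -1 chosen here as the "not found" indication; the class and method names are assumptions for this example.

```java
// Sketch: iterative binary search on a sorted int array.
final class BinarySearchSketch {

    // Returns the index of key in sorted, or -1 if key is not present.
    static int binarySearch(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {                   // the remaining sub-array is not empty
            int mid = low + (high - low) / 2;   // middle element, written to avoid int overflow
            if (key == sorted[mid]) {
                return mid;                     // keys match
            } else if (key < sorted[mid]) {
                high = mid - 1;                 // continue in the left sub-array
            } else {
                low = mid + 1;                  // continue in the right sub-array
            }
        }
        return -1;                              // sub-array is empty: not found
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 7, 9, 11};
        System.out.println(binarySearch(data, 7)); // prints 3
        System.out.println(binarySearch(data, 4)); // prints -1
    }
}
```

Each pass through the loop discards about half of what is left, which is where the logarithm in the analysis comes from.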

Logarithm: Analyzing Binary Search Each iteration in binary search contains a fixed number of operations, denoted by c. Let T(n) denote the time complexity for a binary search on a list of n elements. Assume n is a power of 2; this makes the math simpler and, if it is not true, the difference is trivial. Let k = log n. In other words, n = 2^k. Since binary search eliminates half of the input after two comparisons, we get a CS-style recurrence relation (taking T(1) = 1):
$$T(n) = T\left(\frac{n}{2}\right) + c = T\left(\frac{n}{2^2}\right) + c + c = \cdots = T\left(\frac{n}{2^k}\right) + kc = T(1) + c \log n = 1 + c \log n$$

Logarithmic Time • Ignoring constants and smaller terms, the complexity of the binary search algorithm is O(log n). An algorithm with O(log n) time complexity is called a logarithmic algorithm. • The base of the log is 2, but the base does not affect a logarithmic growth rate, so it can be omitted. • A logarithmic algorithm's running time grows slowly as the problem size increases. If you square the input size, you only double the time for the algorithm.

Sorting Sorting is a classic subject in computer science. There are three reasons for studying sorting algorithms. • First, sorting algorithms illustrate many creative approaches to problem solving that can be applied to other problems. • Second, sorting algorithms are good for practicing fundamental programming techniques using selection statements, loops, methods, and arrays. • Third, sorting algorithms are excellent examples to demonstrate algorithm performance.

Sorting These sorting algorithms apply to sorting any type of object, as long as we can find a way to order them. For simplicity, though, we will first sort numeric values, then more complex objects. When we sort more complex objects, we will sort them by some key. For example, if class Student has instance variables representing CIN, GPA and name, we will sort according to one of these or some combination of them. We have already done this with the priority queue and other examples.

Sorting Arrays and Lists are reference types; that is, the variables we pass between methods do not contain the array or list itself, but a reference to it. Therefore, you can sort an array or list in Java with a void method that takes the reference variable and sorts the elements without returning anything. All other references to the data structure can still be used to access the sorted structure. On the other hand, you can copy all the elements, construct a new sorted list, and return it. This practice may become more common in the near future for reasons you will learn about in a few weeks.
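The two styles can be compared side by side in a short sketch. The class and method names are made up for this example, and the actual sorting is delegated to the standard library's Collections.sort so that the focus stays on references versus copies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch: sorting through a reference versus returning a sorted copy.
final class SortInPlaceOrCopy {

    // void method: sorts the list the caller passed in, so every
    // reference to that list sees the sorted order afterwards.
    static void sortInPlace(List<Integer> list) {
        Collections.sort(list);
    }

    // Copies the elements, sorts the copy, and returns it;
    // the caller's original list is left unchanged.
    static List<Integer> sortedCopy(List<Integer> list) {
        List<Integer> copy = new ArrayList<>(list);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> original = new ArrayList<>(Arrays.asList(3, 1, 2));
        List<Integer> sorted = sortedCopy(original);
        System.out.println(original); // prints [3, 1, 2]
        System.out.println(sorted);   // prints [1, 2, 3]

        sortInPlace(original);
        System.out.println(original); // prints [1, 2, 3]
    }
}
```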

Bubble Sort • Recall that bubble sort repeatedly iterates through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. • This iteration is repeated until no swaps are needed, which indicates that the list is sorted. • The algorithm gets its name from the way smaller elements "bubble" to the top of the list. • Text adapted from Wikipedia

Bubble Sort The number of comparisons is always at least as large as the number of swaps. Therefore, in studying the time complexity, we count the comparisons. The largest key always floats to the right position in the first pass, the next largest rises to the next position in the next pass, etc.

Bubble Sort In the best case, the data is already sorted, so that we only need one pass, making bubble sort O(n) in this case. More importantly, though, for both the average and worst cases, the number of comparisons is:
$$(n-1) + (n-2) + \cdots + 2 + 1 = \frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2}$$
Recall that we are estimating the effect of growth in n, not the exact number of CPU cycles we need. We do not care about the division by 2 since this does not affect the rate of growth. As n increases, the lower-order term n/2 is dominated by the n^2 term. Therefore, we also disregard the lower-order term. Bubble sort time: O(n^2)
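For reference, here is a minimal bubble sort sketch in Java, written for an int array. The names are illustrative, not taken from the labs. The swapped flag is what gives the one-pass best case, and the shrinking inner loop relies on the largest remaining key reaching its final position in each pass.

```java
// Sketch: bubble sort with early exit when a pass makes no swaps.
final class BubbleSortSketch {

    static void bubbleSort(int[] a) {
        boolean swapped = true;
        // After pass p, the last p positions hold their final values,
        // so each pass compares one fewer pair than the one before.
        for (int pass = 1; pass < a.length && swapped; pass++) {
            swapped = false;
            for (int i = 0; i < a.length - pass; i++) {
                if (a[i] > a[i + 1]) {       // adjacent pair in the wrong order
                    int temp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = temp;
                    swapped = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 4, 10, 11, 9, 8, 1};
        bubbleSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 4, 5, 8, 9, 10, 11]
    }
}
```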

Selection Sort and Insertion Sort • These two sorts grow sorted sublists one element at a time. • Selection sort finds the lowest value and moves it to the bottom, or finds the largest element and moves it to the top, then repeats the process for the rest of the values, repeatedly until the list is sorted. • Insertion sort takes one value at a time and places it in the correct spot in the sorted sublist, in the same way most people would sort a hand of cards.

Analyzing Selection Sort The number of comparisons in selection sort is n-1 for the first iteration, n-2 for the second iteration, and so on. Let T(n) denote the complexity for selection sort and c denote the total number of other operations such as assignments and additional comparisons in each iteration. So,
$$T(n) = (n-1) + c + (n-2) + c + \cdots + 2 + c + 1 + c = \frac{n(n-1)}{2} + c(n-1) = \frac{n^2}{2} - \frac{n}{2} + cn - c$$
Ignoring constants and smaller terms, the complexity of selection sort is O(n^2).
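A minimal selection sort sketch in Java shows where those comparison counts come from. It is written for an int array and uses the "find the lowest value" variant; the class and method names are assumptions for this example.

```java
// Sketch: selection sort, moving the smallest remaining value to the front.
final class SelectionSortSketch {

    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            // Find the smallest value in a[i..n-1]:
            // n-1 comparisons when i = 0, n-2 when i = 1, and so on.
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[minIndex]) {
                    minIndex = j;
                }
            }
            // Swap it into position i, extending the sorted sublist by one.
            int temp = a[i];
            a[i] = a[minIndex];
            a[minIndex] = temp;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 4, 10, 11, 9, 8, 1};
        selectionSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 4, 5, 8, 9, 10, 11]
    }
}
```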

Analyzing Insertion Sort • Where selection sort always inserts an element at the end of the sorted sublist, insertion sort requires inserting some elements in arbitrary places. • At the kth iteration, to insert an element into a sorted array of size k, it may take k comparisons to find the insertion position, and k moves to insert the element. Let T(n) denote the complexity for insertion sort and c denote the total number of other operations such as assignments and additional comparisons in each iteration. So,
$$T(n) = (2 + c) + (2 \times 2 + c) + \cdots + (2(n-1) + c) = 2\left(1 + 2 + \cdots + (n-1)\right) + c(n-1) = n^2 - n + cn - c$$
Ignoring constants and smaller terms, the complexity of the insertion sort algorithm is O(n^2). The n^2 and n terms are just twice the values for selection sort.
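Here is a minimal insertion sort sketch in Java for an int array (the names are illustrative only). The inner loop performs both jobs counted in the analysis: it compares to find the insertion position and moves elements to make room.

```java
// Sketch: insertion sort, growing a sorted sublist at the front of the array.
final class InsertionSortSketch {

    static void insertionSort(int[] a) {
        for (int k = 1; k < a.length; k++) {
            int current = a[k];      // next value to place into the sorted sublist a[0..k-1]
            int j = k - 1;
            // Up to k comparisons and k moves: shift larger values one slot right.
            while (j >= 0 && a[j] > current) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = current;      // insert into the gap that was opened
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 4, 10, 11, 9, 8, 1};
        insertionSort(data);
        System.out.println(java.util.Arrays.toString(data)); // prints [1, 4, 5, 8, 9, 10, 11]
    }
}
```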

Quadratic Time • An algorithm with O(n^2) time complexity is called a quadratic algorithm. • Algorithms with nested loops are often quadratic. • A quadratic algorithm's expense grows quickly as the problem size increases. If you double the input size, the time for the algorithm is quadrupled.

Sorting • We often teach bubble sort, selection sort, and insertion sort first because they are easy to understand. Other sort methods are more efficient in average and worst cases. • In particular, there are sort algorithms with O(n log n) complexity. In other words, the consumption of CPU cycles grows proportionally to n times log of n. These algorithms usually involve performing an operation that is O(log n) n times. • Since log n grows much more slowly than n, this is a dramatic improvement over O(n^2): • If n = 2, n^2 = 4 and n log n = 2 • If n = 100, n^2 = 10,000 and n log n ≈ 664 • If n = 10,000, n^2 = 100,000,000 and n log n ≈ 132,877

Merge Sort Merge Sort is a divide-and-conquer algorithm, like the Towers of Hanoi algorithm.

mergeSort(list):
    if list has fewer than two elements, it is already sorted; return list
    split list into firstHalf and secondHalf
    firstHalf = mergeSort(firstHalf);
    secondHalf = mergeSort(secondHalf);
    list = merge(firstHalf, secondHalf);

merge:
    remove the lesser of firstHalf[0] and secondHalf[0] and add it to the new, larger list
    repeat until one of the sublists is exhausted
    add the rest of the remaining sublist to the larger list.
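The outline above can be turned into runnable Java along these lines. This is a sketch for int arrays that returns a new sorted array; the base case, the use of Arrays.copyOfRange to split, and the index variables in merge are choices made for this example rather than details given in the outline.

```java
import java.util.Arrays;

// Sketch: top-down merge sort that returns a new sorted array.
final class MergeSortSketch {

    static int[] mergeSort(int[] list) {
        if (list.length <= 1) {
            return list;    // zero or one element: already sorted
        }
        int mid = list.length / 2;
        int[] firstHalf = mergeSort(Arrays.copyOfRange(list, 0, mid));
        int[] secondHalf = mergeSort(Arrays.copyOfRange(list, mid, list.length));
        return merge(firstHalf, secondHalf);
    }

    // Merges two sorted arrays into one new sorted array.
    static int[] merge(int[] first, int[] second) {
        int[] merged = new int[first.length + second.length];
        int i = 0, j = 0, k = 0;
        // Repeatedly take the lesser front element until one side is exhausted.
        while (i < first.length && j < second.length) {
            merged[k++] = (first[i] <= second[j]) ? first[i++] : second[j++];
        }
        // Copy whatever remains; at most one of these loops does any work.
        while (i < first.length) {
            merged[k++] = first[i++];
        }
        while (j < second.length) {
            merged[k++] = second[j++];
        }
        return merged;
    }

    public static void main(String[] args) {
        int[] sorted = mergeSort(new int[] {5, 4, 10, 11, 9, 8, 1});
        System.out.println(Arrays.toString(sorted)); // prints [1, 4, 5, 8, 9, 10, 11]
    }
}
```

Taking from the first array on ties (the <= in merge) keeps equal keys in their original order, so this version is a stable sort.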

Merge Sort

Merge Two Sorted Lists

Merge Sort Time Assume n is a power of 2. This assumption makes the math simpler. If n is not a power of 2, the difference is trivial. Merge sort splits the list into two sublists, sorts the sublists using the same algorithm recursively, and then merges the sublists. Each recursive call merge sorts half the list, so the depth of the recursion is log n. The single-item lists are, obviously, sorted. Merge Sort reassembles the list in log n steps, just as it broke the list down. Merging the subarrays, across all the sublists at one level of recursion, takes at most n-1 comparisons to compare the elements from the two subarrays and n moves to move elements to the new array. The total time is 2n-1, which is O(n). This happens log n times. Thus, Merge Sort is O(n log n).

Merge Sort Time Here it is again, but with more math. Let T(n) denote the time required for sorting an array of n elements using merge sort. Without loss of generality, assume n is a power of 2. The merge sort algorithm splits the array into two subarrays, sorts the subarrays using the same algorithm recursively, and then merges the subarrays. So, $T(n) = T(n/2) + T(n/2) + \text{mergetime}$. The first T(n/2) is the time for sorting the first half of the array and the second T(n/2) is the time for sorting the second half.

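Merge Sort Time • To merge two subarrays takes at most n-1 comparisons to compare the elements from the two subarrays and n moves to move elements to the temporary array. So, the merge time is 2n-1, and
$$T(n) = 2T\left(\frac{n}{2}\right) + 2n - 1 = 2^2\,T\left(\frac{n}{2^2}\right) + (2n - 2^1) + (2n - 2^0) = \cdots = 2^k\,T\left(\frac{n}{2^k}\right) + (2n - 2^{k-1}) + \cdots + (2n - 2^1) + (2n - 2^0)$$
With k = log n, so that $2^{\log n} = n$, and T(1) = 1:
$$T(n) = n \cdot T(1) + 2n \log n - \left(2^{\log n} - 1\right) = 2n \log n + 1 = O(n \log n)$$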

Quick Sort Quick sort, developed by C. A. R. Hoare (1962), works as follows: • Select an element, called the pivot, in the array. • Divide the array into two parts such that all the elements in the first part are less than or equal to the pivot and all the elements in the second part are greater than the pivot. • Recursively apply the quick sort algorithm to the first part and then the second part.

Quick Sort
function quicksort(a)
    // an array of zero or one elements is already sorted
    if length(a) ≤ 1
        return a
    select and remove a pivot element pivot from a
    create empty lists less and greater
    for each x in a
        if x ≤ pivot then append x to less
        else append x to greater
    // two recursive calls
    return concatenate(quicksort(less), list(pivot), quicksort(greater))

The earliest version of quicksort used the first index as the pivot, and demos of quicksort often still do this for simplicity. However, in an already-sorted array, this will cause the worst-case O(n^2) behavior. The middle index is safer, although yet more complex solutions to this problem also exist.
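The pseudocode can be followed almost line by line in Java using lists. The sketch below is one possible rendering, not the in-place version usually used in practice; it takes the middle element as the pivot, as suggested above, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: list-based quicksort with the middle element as the pivot.
final class QuickSortListSketch {

    // Returns a new sorted list; the argument is not modified.
    static List<Integer> quickSort(List<Integer> a) {
        if (a.size() <= 1) {
            return new ArrayList<>(a);   // zero or one elements: already sorted
        }
        // Work on a copy so that removing the pivot does not change the caller's list.
        List<Integer> rest = new ArrayList<>(a);
        int pivot = rest.remove(rest.size() / 2);   // remove(int index) returns the removed element

        List<Integer> less = new ArrayList<>();
        List<Integer> greater = new ArrayList<>();
        for (int x : rest) {
            if (x <= pivot) {
                less.add(x);
            } else {
                greater.add(x);
            }
        }

        // two recursive calls, then concatenate less + pivot + greater
        List<Integer> result = quickSort(less);
        result.add(pivot);
        result.addAll(quickSort(greater));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> sorted = quickSort(Arrays.asList(5, 4, 10, 11, 9, 8, 1));
        System.out.println(sorted); // prints [1, 4, 5, 8, 9, 10, 11]
    }
}
```

This version allocates new lists at every level of recursion, which makes it easy to read but slower and more memory-hungry than a version that partitions the array in place.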

Quick Sort

package demos;

// http://www.mycstutorials.com/articles/sorting/quicksort
public class QuickSort {

    public void quickSort(int array[]) {
        quickSort(array, 0, array.length - 1);
    }

    public void quickSort(int array[], int start, int end) {
        int i = start; // index of left-to-right scan
        int k = end;   // index of right-to-left scan

        if (end - start >= 1) { // check that there are at least two elements
            int pivot = array[start]; // set the pivot as the first element
            /*System.out.print("pivot " + pivot + ": values ");
            for(int counter = start; counter <= end; counter++)
                System.out.print(array[counter] + " ");
            System.out.println();*/

            while (k > i) { // while the scan indices have not met
                // print the whole array to trace each step
                for (int x : array)
                    System.out.print(x + " ");
                System.out.println();

                // from the left, look for the first element greater than the pivot
                // (bounds are checked before the array is read)
                while (i <= end && k > i && array[i] <= pivot)
                    i++;
                // from the right, look for the first element not greater than the pivot
                while (k >= start && k >= i && array[k] > pivot)
                    k--;
                if (k > i)             // if the left seek index is still smaller than
                    swap(array, i, k); // the right index, swap the corresponding elements
            }
            // after the indices have crossed, swap the last element in
            // the left partition with the pivot
            swap(array, start, k);
            quickSort(array, start, k - 1); // quicksort the left partition
            quickSort(array, k + 1, end);   // quicksort the right partition
        } else {
            // if there is only one element in the partition, do not do any sorting
            return; // the array is sorted, so exit
        }
    }

    // pre: array is full and index1, index2 < array.length
    // post: the values at indices 1 and 2 have been swapped
    public void swap(int array[], int index1, int index2) {
        int temp = array[index1];      // store the first value in a temp
        array[index1] = array[index2]; // copy the value of the second into the first
        array[index2] = temp;          // copy the value of the temp into the second
    }

    public static void main(String[] args) {
        QuickSort q = new QuickSort();
        int[] myArray = { 5, 4, 10, 11, 9, 8, 1 };
        q.quickSort(myArray);
    }
}

Quick Sort Partition Time To partition an array of n elements takes n-1 comparisons and n moves in the worst case. So, the time required for partition is O(n).

Worst-Case Time In the worst case, each time the pivot divides the array into one big subarray with the other empty. • The size of the big subarray is one less than the one before it was divided, so the O(n) partitioning occurs n-1 times. • Worst-case time: $(n-1) + (n-2) + \cdots + 2 + 1 = O(n^2)$

Best-Case Time In the best case, each time the pivot divides the array into two parts of about the same size, so we partition log_2(n) times. • Let T(n) denote the time required for sorting an array of n elements using quick sort. So, $T(n) = T(n/2) + T(n/2) + n$, which gives $T(n) = O(n \log n)$.

Average-Case Time On average, the pivot divides the array neither into two parts of exactly the same size nor into one part and an empty part. • Statistically, the sizes of the two parts are very close. So the average time is O(n log n). The exact average-case analysis depends on the data.

Bucket Sort • All sort algorithms discussed so far are general sorting algorithms that work for any types of keys (e.g., integers, strings, and any comparable objects). • These algorithms sort the elements by comparing their keys. The lower bound for sorting by comparisons is proportional to n log n (formally, Ω(n log n)). So, no sorting algorithm based on comparisons can perform better than O(n log n). • However, if the keys are small integers, you can use bucket sort without having to compare the keys.

Bucket Sort The bucket sort algorithm works as follows. Assume the keys are in the range from 0 to N-1. We need N buckets labeled 0, 1, ..., and N-1. If an element's key is i, the element is put into bucket i. Each bucket holds the elements with the same key value. You can use an ArrayList to implement a bucket. Bucket sort takes O(n + N) time for n elements and N buckets, which is O(n) as long as N grows no faster than n.
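As a rough sketch of the idea, the following Java code sorts int keys in the range 0 to N-1 using one ArrayList per bucket. The class, method, and parameter names are assumptions for this example; numberOfBuckets plays the role of N, and the elements here are just the keys themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: bucket sort for int keys in the range 0 .. numberOfBuckets-1.
final class BucketSortSketch {

    static int[] bucketSort(int[] keys, int numberOfBuckets) {
        // One bucket per possible key value.
        List<List<Integer>> buckets = new ArrayList<List<Integer>>();
        for (int b = 0; b < numberOfBuckets; b++) {
            buckets.add(new ArrayList<Integer>());
        }

        // Drop each key into the bucket with the same label; no key comparisons.
        for (int key : keys) {
            buckets.get(key).add(key);
        }

        // Read the buckets back in label order to get the sorted result.
        int[] sorted = new int[keys.length];
        int k = 0;
        for (List<Integer> bucket : buckets) {
            for (int key : bucket) {
                sorted[k++] = key;
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        int[] sorted = bucketSort(new int[] {3, 0, 2, 3, 1, 0}, 4);
        System.out.println(Arrays.toString(sorted)); // prints [0, 0, 1, 2, 3, 3]
    }
}
```

The first and last loops touch every bucket once and the middle loop touches every element once, which is where the n + N in the running time comes from. A key outside the range 0 to numberOfBuckets-1 would cause an IndexOutOfBoundsException in this sketch.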

Common Recurrence Relations

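Comparing Common Growth Functions From slowest-growing to fastest-growing: Constant time O(1) • Logarithmic time O(log n) • Linear time O(n) • Log-linear time O(n log n) • Quadratic time O(n^2) • Cubic time O(n^3) • Exponential time O(2^n)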

Comparing Common Growth Functions
