
CS 177



Presentation Transcript


  1. Week 13: Searching and Sorting CS 177

  2. Searching for a number • Let’s say that I give you a list of numbers, and I ask you, “Is 37 on this list?” • As a human, you have no problem answering this question, as long as the list is reasonably short • What if the list is an array, and I want you to write a Java program to find some number?

  3. Search algorithm • Easy! • We just look through every element in the array until we find it or run out • If we find it, we return the index, otherwise we return -1
      public static int find( int[] array, int number ) {
          for( int i = 0; i < array.length; i++ )
              if( array[i] == number )
                  return i;
          return -1;
      }

  4. How long does it take? • We talked about Big Oh notation last week • Now we have some way to measure how long this algorithm takes • How long, if n is the length of the array? • O(n) time because we have to look through every element in the array, in the worst case

  5. Can we do better? • Is there any way to go smaller than O(n)? • What complexity classes even exist that are smaller than O(n)? • O(1) • O(log n) • Well, on average, we only need to check half the numbers, that’s ½ n which is still O(n) • Darn…

  6. We can’t do better unless… • We can do better with more information • For example, if the list is sorted, then we can use that information somehow • How? • We can play a High-Low game

  7. Binary search • Repeatedly divide the search space in half • We’re looking for 37, let’s say • [Figure: binary search for 37 in a sorted array, checking the middle of the remaining range each time: 54 (too high), 23 (too low), 31 (too low), 37 (found it!)]
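The halving idea on this slide can be sketched in Java roughly as follows. This is a minimal sketch, not code from the slides: the class name `BinarySearchSketch` and the sample array are assumptions made for illustration, and the array must already be sorted in ascending order.

```java
class BinarySearchSketch {
    // Returns the index of number in a sorted (ascending) array, or -1 if it is absent.
    static int find(int[] sorted, int number) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // middle of the remaining range
            if (sorted[mid] == number) {
                return mid;                     // found it
            } else if (sorted[mid] < number) {
                low = mid + 1;                  // too low: keep the upper half
            } else {
                high = mid - 1;                 // too high: keep the lower half
            }
        }
        return -1;                              // ran out of places to look
    }

    public static void main(String[] args) {
        int[] sorted = {2, 7, 23, 31, 37, 54, 60, 88}; // hypothetical sample data
        System.out.println(find(sorted, 37));
        System.out.println(find(sorted, 40));
    }
}
```

Like the linear search on slide 3, it returns the index on success and -1 on failure, so the two methods are interchangeable for sorted input.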

  8. So, is that faster than linear search? • How long can it take? • What if you never find what you’re looking for? • Well, then, you’ve narrowed it down to a single spot in the array that doesn’t have what you want • And what’s the maximum amount of time that could have taken?

  9. Running time for binary search • We cut the search space in half every time • At worst, we keep cutting n in half until we get 1 • The running time is O(log n) • For 64 items log n = 6, for 128 items log n = 7, for 256 items log n = 8, for 512 items log n = 9, ….
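To see where the log n comes from, one can simply count how many times n can be cut in half before reaching 1. The sketch below does that; the class and method names are made up for this example.

```java
class HalvingSketch {
    // Counts how many times n can be halved (integer division) before it reaches 1.
    static int halvings(int n) {
        int steps = 0;
        while (n > 1) {
            n /= 2;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] sizes = {64, 128, 256, 512};
        for (int n : sizes) {
            System.out.println(n + " items: " + halvings(n) + " halvings");
        }
    }
}
```

For the sizes on the slide this reports 6, 7, 8, and 9 halvings, matching the base-2 logarithms listed there.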

  10. Guessing game • We can apply this idea to a guessing game • First we tell the computer that we are going to guess a number between 1 and n • We guess, and it tries to narrow down the number • It should only take log n tries • log₂(1,000,000) is only about 20
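One way the guessing game might be simulated is sketched below. It assumes the computer does the guessing and is told "too low" or "too high" after each guess; the names `guessesNeeded` and `worstCase` are hypothetical, and the secret is passed in directly instead of being read from a player.

```java
class GuessingGameSketch {
    // Returns how many guesses a halving strategy needs to find secret in [1, n],
    // or -1 if secret is outside that range.
    static int guessesNeeded(int n, int secret) {
        int low = 1;
        int high = n;
        int guesses = 0;
        while (low <= high) {
            int guess = low + (high - low) / 2; // always guess the middle
            guesses++;
            if (guess == secret) {
                return guesses;
            } else if (guess < secret) {
                low = guess + 1;                // answer was "too low"
            } else {
                high = guess - 1;               // answer was "too high"
            }
        }
        return -1;
    }

    // Tries every possible secret and returns the largest number of guesses needed.
    static int worstCase(int n) {
        int worst = 0;
        for (int secret = 1; secret <= n; secret++) {
            worst = Math.max(worst, guessesNeeded(n, secret));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(1000000));
    }
}
```

Running `worstCase(1000000)` should report 20, in line with the estimate on the slide.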

  11. Interview question • This is a classic interview question asked by Microsoft, Amazon, and similar companies • Imagine that you have 9 red balls • One of them is just slightly heavier than the others, but so slightly that you can’t feel it • You have a very accurate two pan balance you can use to compare balls • Find the heaviest ball in the smallest number of weighings

  12. What’s the smallest possible number? • It’s got to be 8 or fewer • We could easily test one ball against every other ball • There must be some cleverer way to divide them up • Something that is related somehow to binary search

  13. That’s it! • We can divide the balls in half each time (with 9 balls, weigh 4 against 4 and leave one out) • If those all balance, it must be the one we left out to begin with

  14. Nope, we can do better • How? • The key is that you can actually cut the number of balls into three parts each time • We weigh 3 against 3, if they balance, then we know the 3 left out have the heavy ball • When it’s down to 3, weigh 1 against 1, again knowing that it’s the one left out that’s heavy if they balance

  15. Thinking outside the box, er, ball • The cool thing is that we are trisecting the search space each time • This means that it takes log₃ n weighings to find the heaviest ball • We can do 9 balls in 2 weighings, 27 balls in 3 weighings, 81 balls in 4 weighings, etc.
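The trisecting strategy can be checked with a small simulation. The sketch below is an illustration under stated assumptions: ball weights are stored in an int array, exactly one ball is heavier than the rest, and the two-pan balance is modeled by comparing sums. The names `weigh`, `findHeavy`, and `weighings` are invented for this example.

```java
import java.util.Arrays;

class BallWeighingSketch {
    static int weighings; // number of times the balance was used in the last search

    // Models the two-pan balance: compares size balls starting at aStart
    // against size balls starting at bStart. Positive means the first pan is heavier.
    static int weigh(int[] w, int aStart, int bStart, int size) {
        weighings++;
        int left = 0;
        int right = 0;
        for (int i = 0; i < size; i++) {
            left += w[aStart + i];
            right += w[bStart + i];
        }
        return Integer.compare(left, right);
    }

    // Returns the index of the single heavier ball, assuming exactly one exists.
    static int findHeavy(int[] w) {
        weighings = 0;
        int lo = 0;          // start of the group that still contains the heavy ball
        int n = w.length;    // size of that group
        while (n > 1) {
            int g = (n + 2) / 3;                  // group size, rounded up
            int result = weigh(w, lo, lo + g, g);
            if (result > 0) {
                n = g;                            // heavy ball is in the first pan
            } else if (result < 0) {
                lo += g;                          // heavy ball is in the second pan
                n = g;
            } else {
                lo += 2 * g;                      // pans balance: it was left out
                n -= 2 * g;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] balls = new int[9];
        Arrays.fill(balls, 10);
        balls[5] = 11; // hypothetical heavy ball
        System.out.println(findHeavy(balls) + " found in " + weighings + " weighings");
    }
}
```

With 9 balls the loop runs twice (9, then 3, then 1), and with 27 balls it runs three times, which is the log₃ n behavior described on the slide.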

  16. Sorting • Searching is really useful • The idea of O(log n) time makes all sorts of real world applications work • Google, for example • But, we can’t do binary search unless our list is sorted • Like searching, computer scientists have devoted a lot of thought to figuring out the best way to do sorting

  17. Sorting • The importance of sorting should be evident to you by now • Applications: • Sorting a column in Excel • Organizing your iTunes playlists by artist name • Ranking a high school graduating class • Finding a median score to report on an exam • Countless others…

  18. But, is it interesting? • Yes! • It’s tricky • No, it’s not! Give me 100 names written on 100 index cards and I can sort them, no problem • One way to remind yourself that it’s tricky is by increasing the problem size • What if I gave you 1,000,000 names written on 1,000,000 index cards? • You might need some organizational system

  19. Computers are stupid • A computer can’t “jump” to the M section, unless you explicitly create an M section or something • For most common sorts, the computer has to compare two numbers (or Strings or whatever) at a time • Based on that comparison, it has to take another step in the algorithm • Remember, we can swap things around in an array

  20. Bubble sort is a classic sorting algorithm • It is very simple to understand • It is very simple to code • It is not very fast • The idea is simply to go through your array, swapping out of order elements until nothing is out of order

  21. Code for a single pass • One “pass” of the bubble sort algorithm goes through the array once, swapping out of order elements
      for( int j = 0; j < array.length - 1; j++ )
          if( array[j] > array[j + 1] )
          {
              int temp = array[j];
              array[j] = array[j + 1];
              array[j + 1] = temp;
          }

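  22. Single pass example • Run through the whole array, swapping any entries that are out of order • [Figure: one pass over 7, 45, 0, 54, 37, 108, 51. Comparing neighbors left to right gives: no swap, swap, no swap, swap, no swap, swap. The array after the pass is 7, 0, 45, 37, 54, 51, 108]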

  23. How many passes do we need? • How bad could it be? • What if the array was in reverse-sorted order? • One pass would only move the largest number to the bottom • We would need n – 1 passes to sort the whole array • [Figure: the reverse-sorted array 7, 6, 5, 4, 3, 2, 1 shown after each successive pass, with 7 moving to the bottom first and one more number settling into place on every pass]

  24. Full bubble sort code • The full Java method for bubble sort would require us to have at least n – 1 passes • Alternatively, we could keep a flag to indicate that no swaps were needed on a given pass
      for( int i = 0; i < array.length - 1; i++ )
          for( int j = 0; j < array.length - 1; j++ )
              if( array[j] > array[j + 1] )
              {
                  int temp = array[j];
                  array[j] = array[j + 1];
                  array[j + 1] = temp;
              }
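The flag alternative mentioned on slide 24 might look something like the sketch below. The class name and the choice to return the number of passes are assumptions made for illustration; the inner loop is the same single pass shown on slide 21.

```java
class BubbleSortSketch {
    // Sorts array in ascending order and returns how many passes were made.
    // Stops as soon as a full pass makes no swaps.
    static int bubbleSort(int[] array) {
        int passes = 0;
        boolean swapped = true;           // flag: did the last pass swap anything?
        while (swapped) {
            swapped = false;
            passes++;
            for (int j = 0; j < array.length - 1; j++) {
                if (array[j] > array[j + 1]) {
                    int temp = array[j];
                    array[j] = array[j + 1];
                    array[j + 1] = temp;
                    swapped = true;       // something was out of order
                }
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] data = {7, 45, 0, 54, 37, 108, 51};
        int passes = bubbleSort(data);
        System.out.println(java.util.Arrays.toString(data) + " after " + passes + " passes");
    }
}
```

On an array that is already sorted, this version makes a single pass and stops, where the fixed double loop would still make all n – 1 passes.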

  25. Ascending sort • The bubble sort we saw sorts integers in ascending order • What if you wanted to sort them in descending order? • Only a single change is needed to the inner loop:
      for( int j = 0; j < array.length - 1; j++ )
          if( array[j] < array[j + 1] )
          {
              int temp = array[j];
              array[j] = array[j + 1];
              array[j + 1] = temp;
          }

  26. What’s the running time of bubble sort? • The outer loop runs n – 1 times • The inner loop runs n – 1 times • The inner loop has a constant amount of work inside of it, call it c • (n – 1)(n – 1)c = cn² – 2cn + c, which is… • O(n²) • Hmm, not great, let’s try another sort

  27. Insertion sort • Instead of “bubbling” down the largest (or smallest) number, keep the first k elements sorted, and keep increasing k • Philosophically, not that different from bubble sorting • The nice thing is that we can stop sorting whenever the new thing we added is in place

  28. Insertion sort code • The nice thing is that each inner loop runs at most i times
      for( int i = 1; i < array.length; i++ )
          for( int j = i; j > 0; j-- ) //count back
              if( array[j - 1] > array[j] )
              {
                  int temp = array[j];
                  array[j] = array[j - 1];
                  array[j - 1] = temp;
              }
              else break;

  29. What’s the running time of insertion sort? • The outer loop runs n – 1 times • Well, each inner loop runs a maximum of i times, where i is the current iteration of the outer loop • 1 + 2 + 3 + … + (n – 1) = ? • = n(n – 1)/2 = ½n² – ½n, which is… • O(n²)

  30. Better than quadratic? • Is there a way to sort things that is better than quadratic time? • Yes! • Merge sort • Keep dividing your list in half, over and over, until you get down to two lists with one element in each • Merge the lists together, sorting them as you do, and merge the sorted list of 2 with another sorted list of 2, then merge lists of 4, and keep going until you have merged everything together • It takes O(n log n), which is the best you can do for a comparison based sort
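The slides describe merge sort only in words, so here is one possible Java sketch of the divide-then-merge idea. It is a simple top-down version that copies sub-arrays for clarity, not speed; the class and method names are assumptions, and this is not code from the course.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Returns a sorted (ascending) version of array.
    // Arrays of length 0 or 1 are already sorted and are returned as-is.
    static int[] mergeSort(int[] array) {
        if (array.length <= 1) {
            return array;
        }
        int mid = array.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(array, 0, mid));             // sort first half
        int[] right = mergeSort(Arrays.copyOfRange(array, mid, array.length)); // sort second half
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] merged = new int[left.length + right.length];
        int i = 0;
        int j = 0;
        int k = 0;
        while (i < left.length && j < right.length) {
            // Take the smaller front element; ties go to the left list.
            if (left[i] <= right[j]) {
                merged[k++] = left[i++];
            } else {
                merged[k++] = right[j++];
            }
        }
        while (i < left.length) {
            merged[k++] = left[i++];   // copy whatever remains on the left
        }
        while (j < right.length) {
            merged[k++] = right[j++];  // copy whatever remains on the right
        }
        return merged;
    }

    public static void main(String[] args) {
        int[] data = {7, 45, 0, 54, 37, 108, 51};
        System.out.println(Arrays.toString(mergeSort(data)));
    }
}
```

Each level of merging touches all n elements once, and there are about log n levels of halving, which is where the O(n log n) running time comes from.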

  31. Bucket sort paradigm • You use bucket sort when you know that your data is in a narrow range, like the numbers between 1 and 10 or even 1 and 100 • As long as the range of possible values is in the neighborhood of the length of your list, bucket sort can do well • Example: 150 students with integer grades between 1 and 100 • Doesn’t work for sorting doubles or Strings

  32. Bucket sort algorithm • Make an array with enough elements to hold every possible value in your range of values • If you need 1 – 100, make an array with length 100 • Sweep through your original list of numbers; when you see a particular value, increment the entry at the corresponding index in the value array • To get your final sorted list, sweep through your value array and, for every entry with value k > 0, print its index k times

  33. Bucket sort example • We know our values will be in the range [1,10] • Our example array: eight unsorted values (one 1, three 2s, two 6s, one 7, one 10) • Our values array (counts for the values 1 through 10): 1 3 0 0 0 2 1 0 0 1 • The result: 1 2 2 2 6 6 7 10

  34. Bucket sort in code • Here’s bucket sort in code with a range of [min, max]:
      int[] values = new int[max - min + 1];
      for( int i = 0; i < array.length; i++ )
          values[array[i] - min]++;
      int count = 0;
      for( int i = 0; i < values.length; i++ ) {
          for( int j = 0; j < values[i]; j++ ) {
              array[count] = i + min;
              count++;
          }
      }

  35. How long does bucket sort take? • It takes O(n) time to scan through the original array • But, now we have to take into account the number of values we expect • So, let’s say we have m possible values • It takes O(m) time to scan back through the value array, with O(n) additional updates to the original array • Time: O(n + m)
