330 likes | 354 Views
CSE 326: Data Structures: Sorting. Lecture 17: Wednesday, Feb 19, 2003. Today. Bucket sort (review) Radix sort Merge sort for external memory. Lower Bound. Recall: Any sorting algorithm based on comparisons requires (n log n) time
E N D
CSE 326: Data Structures: Sorting Lecture 17: Wednesday, Feb 19, 2003
Today • Bucket sort (review) • Radix sort • Merge sort for external memory
Lower Bound Recall: • Any sorting algorithm based on comparisons requires (n log n) time • More precisely: there exists a “bad input” for that algorithm, on which it takes (n log n) • The theorem can be extended: the average running time is (n log n) • The next two algorithms (Bucket, Radix) apparently break this theorem !
Bucket Sort • Now let’s sort in O(N) • Assume:A[0], A[1], …, A[N-1] {0, 1, …, M-1}M = not too big • Example: sort 1,000,000 person records on the first character of their last names: • Hence M = 128 (in practice: M = 27)
Bucket Sort int bucketSort(Array A, int N) { for k = 0 to M-1 Q[k] = new Queue; for j = 0 to N-1 Q[A[j]].enqueue(A[j]); Result = new Queue; for k = 0 to M-1 Result = Result.append(Q[k]); return Result; } Stablesorting !
Bucket Sort • Running time: O(M+N) • Space: O(M+N) • Recall that M << N, hence time = O(N) • What about the Theorem that says sorting takes (N log N) ?? This is not based on key comparisons, insteadexploits the fact that keysare small
Radix Sort • I still want to sort in time O(N): non-trivial keys • A[0], A[1], …, A[N-1] are strings • Very common in practice • Each string is:cd-1cd-2…c1c0, where c0, c1, …, cd-1{0, 1, …, M-1}M = 128 • Other example: decimal numbers
RadixSort • Radix = “The base of a number system” (Webster’s dictionary) • alternate terminology: radix is number of bits needed to represent 0 to base-1; can say “base 8” or “radix 3” • Used in 1890 U.S. census by Hollerith • Idea: BucketSort on each digit, bottom up.
The Magic of RadixSort • Input list: 126, 328, 636, 341, 416, 131, 328 • BucketSort on lower digit:341, 131, 126, 636, 416, 328, 328 • BucketSort result on next-higher digit:416, 126, 328, 328, 131, 636, 341 • BucketSort that result on highest digit:126, 131, 328, 328, 341, 416, 636
Inductive Proof that RadixSort Works • Keys: d-digit numbers, base B • (that wasn’t hard!) • Claim: after ith BucketSort, least significant i digits are sorted. • Base case: i=0. 0 digits are sorted. • Inductive step: Assume for i, prove for i+1. Consider two numbers: X, Y. Say Xi is ith digit of X: • Xi+1< Yi+1 then i+1th BucketSort will put them in order • Xi+1> Yi+1 , same thing • Xi+1= Yi+1 , order depends on last i digits. Induction hypothesis says already sorted for these digits because BucketSort is stable
Radix Sort int radixSort(Array A, int N) { for k = 0 to d-1 A = bucketSort(A, on position k) } Running time: T = O(d(M+N)) = O(dN) = O(Size)
Radix Sort A= A= A=
Running time of Radixsort • N items, d digit keys of max value M • How many passes? • How much work per pass? • Total time?
Running time of Radixsort • N items, d digit keys of max value M • How many passes? d • How much work per pass? N + M • just in case M>N, need to account for time to empty out buckets between passes • Total time? O( d(N+M) )
Radix Sort • What is the size of the input ? Size = dN • Radix sort takes time O(Size) !!
Radix Sort • Variable length strings: • Can adapt Radix Sort to sort in time O(Size) ! • What about our Theorem ??
Radix Sort • Suppose we want to sort N distinct numbers • Represent them in decimal: • Need d=log N digits • Hence RadixSort takes time O(Size) = O(dN) = O(N log N) • The total Size of N keys is O(N log N) ! • No conflict with theory
Sorting HUGE Data Sets • US Telephone Directory: • 300,000,000 records • 64-bytes per record • Name: 32 characters • Address: 54 characters • Telephone number: 10 characters • About 2 gigabytes of data • Sort this on a machine with 128 MB RAM… • Other examples?
Merge Sort Good for Something! • Basis for most external sorting routines • Can sort any number of records using a tiny amount of main memory • in extreme case, only need to keep 2 records in memory at any one time!
External MergeSort • Split input into two “tapes” (or areas of disk) • Merge tapes so that each group of 2 records is sorted • Split again • Merge tapes so that each group of 4 records is sorted • Repeat until data entirely sorted log N passes
Sorting • Illustrates the difference in algorithm design when your data is not in main memory: • Problem: sort 8Gb of data with 8Mb of RAM • We know we can do it in O(n log n) time, but let’s see the number of disk I/O’s
2-Way Merge-sort:Requires 3 Buffers • Pass 1: Read a page, sort it, write it. • only one buffer page is used • Pass 2, 3, …, etc.: • three buffer pages used. Buffer size = 8Kb (typically) INPUT 1 OUTPUT INPUT 2 Main memory buffers Disk Disk
2-Way Merge-sort • A run = a sequence of sorted elements • Main property of 2-way merge: • If the minimum run length is L, then after merge the minimum run length is 2L • Initially: minimum run length = 1 (why ?) • After one pass = 2 • After two passes = 4 • . . .
Two-Way External Merge Sort 6,2 2 Input file 9,4 8,7 5,6 3,1 3,4 PASS 0 1,3 2 1-page runs 2,6 4,9 7,8 5,6 3,4 • Each pass we read + write each page in file. • N pages in the file => the number of passes • So total cost is: • Improvement: start with larger runs • Sort 1GB with 1MB memory in 10 passes PASS 1 4,7 1,3 2,3 2-page runs 8,9 5,6 2 4,6 PASS 2 2,3 4,4 1,2 4-page runs 6,7 3,5 6 8,9 PASS 3 1,2 2,3 3,4 8-page runs 4,5 6,6 7,8 9
2-Way Merge-sort • Hence we need exactly log N passes through the entire data • How much is N ? N 106 (why ?) • Hence we need to read and write the entire 8GB data log(106) = 20 times ! • It takes about 1minute to read 1GB of data • even more • It takes at least160 minutes to sort = 3 hours
2-Way Merge-sort: less dumb • Use the 8Mb of main memory better ! • Initial step: Run formation • Read 8Mb of data in main memory • Sort (what algorithm would you use ?) • Write to disk • Now the runs are 8Mb after one pass ! • After subsequent passes the runs are • 2 8Mb, 4 8Mb, . . . • Need log(8Gb/8Mb) = log(103) = 10 passes • 1.5h instead of 3h have time to see a movie
Can We Do Better ? • We have more main memory • Used it during all passes • Multiway merge: • Given M sorted sequence • Merge them
Multiway Merge m At each step:select the smallestvalue among the M,store in the output What datastructure shouldwe use here ?
Multiway Merge-Sort • Phase one: load M bytes in memory, sort • Result: runs of length M/B blocks M/B blocks . . . . . . Disk Disk M bytes of main memory B = size of one block = 8Kb typically
Pass Two • Merge m = M/B runs into a new run • Runs have M/B (M/B – 1) (M/B)2 blocks Input 1 . . . . . . Input 2 Output . . . . Input M/B Disk Disk M bytes of main memory
Pass Three • Merge M/B – 1 runs into a new run • Runs have now M/R (M/B – 1)2 (M/B)3 blocks Input 1 . . . . . . Input 2 Output . . . . Input M/B Disk Disk M bytes of main memory
Multiway Merge-Sort • Input file has N bytes • Need logM/B (N/B) = (log(N/B)) / (log(M/B)) complete passes over the file • N/B = 8Gb / 8Kb = 106 • M/B = 8Mb / 8Kb = 103 • Hence need log(106)/log(103) = 2 passes ! • 2 minutes ! Time for two movies
Multiway Merge-Sort • With today’s main memories, we can sort almost any file in two passes • The file can have (M/B)2 blocks • XML Toolkit: • xsort sorts using multiway merge, in two passes