Lecture 6 : External Sorting

Bong-Soo Sohn

Assistant Professor

School of Computer Science and Engineering

Chung-Ang University


External Sorting

  • Sorting algorithm that can handle massive amounts of data (using external memory)

  • Required when data does not fit into main memory

  • Out-of-core algorithm (data kept on disk) vs. in-core algorithm (data kept in main memory)


Motivation

  • Sometimes the data to sort are too large to fit in memory (Why not virtual memory?)

  • Use external memory (disk)

    • Disk performance

      • seek time (major factor)

      • rotational latency

      • transfer time

  • Primary rule for disk access

    • Minimize the number of disk accesses

  • Assume external (secondary) memory is divided into equal-sized blocks (e.g., 1 KB, 4 KB, …)

    • Block: the unit in which data is stored and retrieved
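The disk-performance bullets above can be turned into a rough cost model: the time for one access is seek time plus rotational latency plus transfer time. The sketch below is only an illustration; the function name and the default figures (8 ms seek, 4 ms rotational latency, 100 MB/s transfer rate) are assumptions, not values from the lecture.

```python
def access_time_ms(block_bytes, seek_ms=8.0, rotation_ms=4.0, transfer_mb_per_s=100.0):
    """Estimated time in milliseconds to fetch one block from disk.

    All default figures are assumed, illustrative values.
    """
    # Transfer time grows with block size; seek and rotation are paid once per access.
    transfer_ms = block_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms + rotation_ms + transfer_ms


small_ms = access_time_ms(4 * 1024)      # one 4 KB block
large_ms = access_time_ms(1024 * 1024)   # one 1 MB block
print(f"4 KB block: {small_ms:.2f} ms, 1 MB block: {large_ms:.2f} ms")
```

Under these assumed figures, a 4 KB access costs about 12.0 ms and a 1 MB access about 22.5 ms: 256 times more data for less than twice the time. The fixed per-access cost dominates, which is why the primary rule is to minimize the number of disk accesses.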


External Merge Sort : Idea

  • Example: sorting 900 MB of data using only 100 MB of RAM:

    1. Read 100 MB of the data into main memory and sort it by some conventional method (usually quicksort).

    2. Write the sorted data to disk.

    3. Repeat steps 1 and 2 until all of the data is sorted in 100 MB chunks (9 chunks in total), which now need to be merged into one single output file.

    4. Read the first 10 MB of each sorted chunk (the input buffers) into main memory (90 MB total) and allocate the remaining 10 MB for an output buffer.

    5. Perform a 9-way merge and store the result in the output buffer. When the output buffer is full, write it to the final sorted file. When any of the 9 input buffers becomes empty, refill it with the next 10 MB of its associated 100 MB sorted chunk; if the chunk has no more data, mark it as exhausted and do not use it for merging.
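The procedure above can be sketched in Python. This is a minimal illustration, not the lecture's implementation. It assumes the input is a text file with one integer per line, and it bounds memory by a line count (`max_lines_in_memory`, a hypothetical parameter) rather than by megabytes. Python's buffered file objects stand in for the explicit 10 MB input and output buffers, and `heapq.merge` performs the k-way merge.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(input_path, output_path, max_lines_in_memory):
    """Sort a file of one integer per line while holding only one chunk in memory.

    Returns the number of sorted runs that were created and merged.
    """
    run_paths = []

    # Phase 1: read a chunk that fits in memory, sort it, write it to disk as a run.
    with open(input_path) as src:
        while True:
            chunk = [int(line) for line in islice(src, max_lines_in_memory)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{x}\n" for x in chunk)
            run_paths.append(path)

    # Phase 2: k-way merge of all runs. Each run is read lazily, so only a
    # small buffer per run is in memory at any time.
    run_files = [open(p) for p in run_paths]
    try:
        with open(output_path, "w") as out:
            streams = [(int(line) for line in f) for f in run_files]
            out.writelines(f"{x}\n" for x in heapq.merge(*streams))
    finally:
        for f in run_files:
            f.close()
        for p in run_paths:
            os.remove(p)
    return len(run_paths)
```

Calling `external_sort("in.txt", "out.txt", 100)` on a 900-line file mirrors the example: it creates 9 sorted runs and then merges them in a single 9-way pass.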


2-way merge sort

[Figure: merge tree for 20 initial runs. R1 to R20 are merged pairwise into S1 to S10, then T1 to T5, U1 to U3, V1 and V2, and finally the single run W1.]

  • # of passes : 5
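The pairwise merging in the 2-way diagram can be simulated in a few lines. This is a sketch under simplifying assumptions: in-memory Python lists stand in for on-disk runs, and the function name `two_way_merge_sort` is hypothetical.

```python
import heapq


def two_way_merge_sort(runs):
    """Merge adjacent pairs of sorted runs until a single run remains.

    Returns (final sorted run, number of merge passes).
    """
    passes = 0
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            pair = runs[i:i + 2]  # if the count is odd, the last run is carried over alone
            merged.append(list(heapq.merge(*pair)))
        runs = merged
        passes += 1
    return (runs[0] if runs else []), passes


# 20 single-element runs, standing in for R1..R20
initial_runs = [[x] for x in range(20, 0, -1)]
result, passes = two_way_merge_sort(initial_runs)
print(passes)
```

With 20 initial runs the run count shrinks 20, 10, 5, 3, 2, 1, so the loop performs 5 passes, matching the diagram.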


5-way merge sort

[Figure: merge tree for 20 initial runs. R1 to R20 are merged five at a time into S1 to S4, which are then merged into the single run T1.]

  • we can reduce # of passes (2 passes instead of 5)
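How the number of passes depends on the merge fan-in k can be checked with a small helper. This is a sketch; `merge_passes` is a hypothetical name, and it counts only merge passes, not the initial run-creation pass.

```python
def merge_passes(n_runs, k):
    """Number of k-way merge passes needed to reduce n_runs sorted runs to one."""
    if k < 2:
        raise ValueError("k must be at least 2")
    passes = 0
    while n_runs > 1:
        n_runs = -(-n_runs // k)  # ceiling division: runs remaining after one pass
        passes += 1
    return passes


for k in (2, 5):
    print(f"{k}-way merge of 20 runs: {merge_passes(20, k)} passes")
```

For 20 runs this gives 5 passes with k = 2 and 2 passes with k = 5. Each pass reads and writes the data once, so a larger fan-in directly reduces disk traffic, at the cost of needing one input buffer per run in memory.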

