Improve Run Generation Overlap input,output, and internal CPU work. Reduce the number of runs (equivalently, increase average run length). DISK MEMORY DISK
Internal Quick Sort Use 6 as the pivot (median of 3). Input first, middle, and last blocks first. In-place partitioning. Input blocks from the ends toward the middle. Sort left and right groups recursively. Can begin output as soon as left most block is ready
Alternative Internal Sort Scheme DISK B1B2B3 Partition into 3 buffers.
Steady State Operation Read from disk Write to disk Run generation Synchronization is done when the active input buffer gets empty (the active output buffer will be full at this time).
DISK MEMORY DISK New Strategy Use 2 input and 2 output buffers. Rest of memory is used for a min loser tree. Input 1Input 0Output 0Output 1 Loser Tree
Steady State Operation Read from disk Write to disk Run generation Synchronization is done when the active input buffer gets empty (the active output buffer will be full at this time).
O0O1 I0 I1 Initialize
O0O1 I0 I1
Initialize O0O1 I0 I1
Initialize O0O1 I0 I1
Initialize O0O1 I0 I1
Initialize O0O1 I0 I1
Generate Run O0O1 I0 I
Generate Run O0O1 I0 I
5 O0 2 3 Generate Run O1 I0 I
45 O0 2 3 Generate Run O1 I0 I
5 O O1 I0 I
5 O O1 I0 I Continue With Run 1
O O I0 I Continue With Run 1 1 5
1 O O I0 I Continue With Run
9 1 O O I0 I Continue With Run
9 1 O O I0 I
9 1 O O I0 I Continue With Run 1 2
2 9 1 O O I0 I Continue With Run
2 9 1 O O I0 I Continue With Run
Buffer Reduction We may reduce the number of buffers to 3. At any time, 1 us used to read into, 1 to write from, and the 3 rd both feeds the loser tree and is filled from the tree. The 3 physical buffers rotate through these three roles.
Let k be number of external nodes in loser tree. Run size >= k. Sorted input => 1 run. Reverse of sorted input => n/k runs. Average run size is ~2k.
Memory capacity = m records. Run size using fill memory, sort, and output run scheme = m. Use loser tree scheme. Assume block size is b records. Need memory for 4 buffers (4b records). Loser tree k = m – 4b. Average run size = 2k = 2(m – 4b). 2k >= m when m >= 8b.
Assume b = 100. m k k
Total internal processing time using fill memory, sort, and output run scheme = O((n/m) m log m) = O(n log m). Total internal processing time using loser tree = O(n log k). Loser tree scheme generates runs that differ in their lengths.
4369 Merging Runs Of Different Length