On Optimal and Efficient in Place Merging. Pok-Son Kim, Kookmin University, Department of Mathematics, Seoul 135-702, Korea. Arne Kutzner, Seokyeong University, Department of E-Business, Seoul, Korea.

SOFSEM 2006: Merging
Make one sorted array out of two consecutive sorted arrays.
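As a baseline, the task can be sketched with a plain textbook merge. This is not the paper's algorithm (it spends O(m) extra space for a copy of the left run, which is exactly what the talk eliminates); the function name is illustrative:

```python
def merge_adjacent(a, mid):
    """Merge the sorted runs a[:mid] and a[mid:] into one sorted list.

    Textbook merge with O(mid) extra space; the talk's contribution
    is achieving the same result in place with O(1) extra space.
    """
    left = a[:mid]                 # buffer holding the left run
    i, j, k = 0, mid, 0
    while i < len(left) and j < len(a):
        if left[i] <= a[j]:        # '<=' keeps the merge stable
            a[k] = left[i]; i += 1
        else:
            a[k] = a[j]; j += 1
        k += 1
    while i < len(left):           # leftover elements of the left run
        a[k] = left[i]; i += 1; k += 1
    return a                       # remaining right-run elements are already in place
```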

Lower Bounds for Merging
Number of comparisons
–Argumentation over the decision tree (see Knuth) yields Ω(m·log(n/m + 1)) for m ≤ n
Number of assignments
–Each element can change its position in the final sequence, so Ω(m + n) assignments are required

Notions
An algorithm merges two adjacent sequences "in place" when it needs only constant additional space.
Stability: a merging algorithm is stable when it preserves the initial ordering of elements with equal value.

We present... a stable, asymptotically optimal, in-place merging algorithm.

Foundation: Algorithm of Hwang and Lin [1972]
Merging algorithm with the following properties:
–Asymptotically optimal regarding comparisons: O(m·log(n/m + 1)) for m ≤ n
–Two variants:
External space of size m (not in place), 2m + n assignments
External space of size O(1), but a number of assignments that is not asymptotically optimal
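The comparison-saving idea can be sketched as follows. This is a simplified stand-in, not Hwang and Lin's exact procedure (which probes at exponentially spaced positions before binary searching): each element of the shorter run u is placed into the longer run v by binary search, and like the buffered variant it is not in place:

```python
from bisect import bisect_left

def hwang_lin_sketch(u, v):
    """Binary-search merge of the shorter sorted run u into the longer
    sorted run v. Simplified sketch of the Hwang-Lin idea; uses
    O(m + n) extra space for the output list."""
    out, j = [], 0
    for x in u:
        # find the slot for x among the not-yet-consumed part of v;
        # bisect_left keeps u-elements before equal v-elements (stable,
        # with u taken as the left input)
        k = bisect_left(v, x, lo=j)
        out.extend(v[j:k]); j = k
        out.append(x)
    out.extend(v[j:])
    return out
```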

Step 1: Reducing the external space from m to O(√m)
Granulation of the shorter input sequence u (size m) into l blocks of equal size k, plus a remaining block of size m − l·k.

Reducing the external space from m to O(√m) (cont.)
Split u_i into b_i x_i, so that x_i is the last element of u_i, for 0 ≤ i ≤ l.
Granulation of v into pieces v_0, …, v_{l+1} that fit between consecutive x_i (technically l+1 binary searches).
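The blocking and the l+1 binary searches can be sketched as below. The block size k = ⌊√m⌋ is an assumption of this sketch (chosen because it balances block count against block length), and the helper name is illustrative:

```python
from bisect import bisect_right
from math import isqrt

def granulate(u, v):
    """Split the shorter sorted run u into blocks of size about sqrt(m),
    take each block's last element x_i, and cut the longer sorted run v
    at those values via one binary search per block."""
    m = len(u)
    k = max(isqrt(m), 1)                       # assumed block size
    blocks = [u[i:i + k] for i in range(0, m, k)]
    lasts = [b[-1] for b in blocks]            # the x_i elements
    cuts, lo = [], 0
    for x in lasts:                            # one binary search each
        lo = bisect_right(v, x, lo=lo)
        cuts.append(lo)
    pieces = [v[a:b] for a, b in zip([0] + cuts, cuts + [len(v)])]
    return blocks, pieces
```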

Kernel Algorithm
Block rearrangements bring the pieces b_i, x_i and v_i into order; then l+1 local merges using Hwang and Lin (necessary external space O(√m)) produce the sorted sequence.

Block Rearrangements
"Tricky" technique
–Kernel idea: result of Mannila and Ukkonen [1984]
Main characteristics:
–Iterative processing, starting with the placement of u_0, v_0, continuing with u_1, v_1 and so on
–Altogether: O(m + n) assignments
–Nasty: "unplaced" u_i blocks can be interleaved, therefore a repeated search for the minimal block is necessary. Additional costs:
O(l²) = O(m) comparisons for the repeated search
l·(7k) ≤ 7m assignments for minimal block extraction

Overall Complexity of the Kernel Algorithm
l+1 calls of Hwang and Lin
+ l+1 binary searches
+ block rearrangements (foregoing slide)
= O(m·log(n/m + 1)) comparisons, O(m + n) assignments

Step 2: Reducing the external space from O(√m) to O(1)
Kernel idea: creation of an "internal buffer" of size O(√m)
–Technique first described by Kronrod [1968]
–Created by an initial splitting step
–Elements of the internal buffer can be disordered during merging
–Finally the elements of the internal buffer are sorted and merged

Unstable in Place Alg.
–Binary search and rotation split off an internal buffer u_1 of size O(√m)
–Kernel Alg. merges the remaining input (u_1 serves as buffer)
–Sort the buffer, then merge it in with Hwang and Lin using external space O(1)
–Result: sorted sequence
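The rotation used here to move segments around without extra memory is the classic three-reversal trick, which can be sketched as:

```python
def _reverse(a, i, j):
    """Reverse a[i:j] in place."""
    j -= 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i += 1; j -= 1

def rotate(a, lo, mid, hi):
    """Exchange the adjacent segments a[lo:mid] and a[mid:hi] in place.

    Reversing both segments and then the whole range swaps them using
    O(1) extra space and at most a constant number of moves per element.
    """
    _reverse(a, lo, mid)
    _reverse(a, mid, hi)
    _reverse(a, lo, hi)
```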

Complexity of Unstable in Place Algorithm
Lemma: the unstable in place Alg. is asymptotically optimal regarding the number of comparisons and assignments.
Proof: simply count the additional operations
–The binary search and Hwang and Lin trivially don't change the asymptotic number of comparisons
–Hwang and Lin's call causes O(m + n) additional assignments
–Insertion sort of the buffer needs O(m) comparisons as well as assignments

Deriving a Stable Alg.
Two reasons for lacking stability:
–The internal buffer might contain equal elements (the initial order of equal elements can't be restored by insertion sort)
–Two blocks u_i and u_j (0 ≤ i, j ≤ l, i ≠ j) that contain equal elements can't be distinguished during the search for the minimal block

Deriving a Stable Alg. (cont.)
Kernel idea: extraction of distinct elements as buffer elements
–Some buffer elements serve the local merges
–The others keep track of the reordering of the u_i blocks (movement imitation buffer)
–Reordering of the buffer elements now doesn't affect stability, because all buffer elements are different!

Partitioning Scheme
–Every rearrangement of the u_i blocks is mirrored in the movement imitation buffer
–An additional counter variable for the number of "already placed" blocks is necessary
(Diagram: movement imitation buffer and buffer for local merges, followed by the u_i blocks and v.)
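The mirroring idea can be sketched as follows; all names here are illustrative, not from the paper. Whenever two blocks of the shorter run are exchanged, the corresponding (pairwise distinct) buffer elements are exchanged too, so the original block order can always be reconstructed by comparing buffer elements:

```python
def swap_blocks(a, mi_buf, i, j, k):
    """Exchange blocks i and j (each of size k) inside a, and mirror
    the move in the movement imitation buffer mi_buf. Because the
    buffer elements are pairwise distinct, their relative order encodes
    the original order of the blocks at all times."""
    for t in range(k):
        a[i * k + t], a[j * k + t] = a[j * k + t], a[i * k + t]
    mi_buf[i], mi_buf[j] = mi_buf[j], mi_buf[i]
```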

Deriving a Stable Alg. (cont.)
Application of the following modifications to the unstable algorithm:
–Initial buffer extraction (technique described by Pardo [1977])
–Replacement of the search for the minimal block by management of the movement imitation buffer
–Slightly different final merging of the sorted buffers: the sorted buffer and the sorted sequence are merged by Hwang and Lin with external space O(1) into the final sorted sequence

Complexity of Stable Algorithm
Lemma: the stable in place Alg. is asymptotically optimal regarding comparisons and assignments.
Proof: check all modifications applied to the unstable algorithm.
–Buffer extraction needs O(m) comparisons and O(m) assignments
–Repeated search of the minimal block and management of the mi-buffer stay within O(m) comparisons and assignments
–The modified final merging has no impact

Special Case: Too Few Buffer Elements
We use a slightly modified version of Hwang and Lin's Alg.
–Instead of directly inserting single elements, we first extract maximal segments of equal elements (maximal segments are found by a linear search):
A) Hwang and Lin applied to single elements
B) Hwang and Lin applied to groups of equal elements
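The linear search for maximal segments can be sketched as below; the function name is illustrative:

```python
def equal_segments(u):
    """Cut a sorted run u into maximal segments of equal elements by a
    single linear scan. Each segment can then be inserted as one group
    instead of element by element."""
    segs, start = [], 0
    for i in range(1, len(u) + 1):
        # a segment ends at the end of u or where the value changes
        if i == len(u) or u[i] != u[start]:
            segs.append(u[start:i])
            start = i
    return segs
```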

Special Case (cont.): Too Few Buffer Elements
Effect of the modification: we can express the number of assignments depending on the number of different elements in u.
Modified stable algorithm: the movement imitation buffer is kept in front of the blocks of u and v, and the modified Hwang and Lin is used for the local merges.

Special Case: Complexity
Lemma: the stable Alg. for the case of too few buffer elements is asymptotically optimal regarding assignments and comparisons.
Proof: the only significant modifications are
–the changed size of the u blocks
–the modified variant of Hwang and Lin.

Experimental Results
Unstable as well as stable Alg. are ready for practice!
–Impact of the time per comparison! (Here we took integer comparisons.)
(Benchmark chart: running time and number of comparisons.)

Related Work
Three papers present similar results:
–Symvonis [1995]: description of a "may be" algorithm design
–Geffert et al. [2000]: complex non-modular algorithm; no remarks regarding implementation or benchmarking
–Chen [2003]: slightly simplified version of Geffert's Alg.; no remarks regarding implementation or benchmarking
All papers rely on the work of Hwang and Lin, Kronrod, as well as Mannila and Ukkonen.

Conclusion
Presentation of an unstable as well as a stable merging algorithm:
–In place
–Asymptotically optimal regarding the number of comparisons as well as assignments
Highlights:
–The Alg. has a modular and transparent structure
–The Alg. was implemented; the kernel part is described in pseudo-code (in the paper)
–Experimental results: benchmarking
–Several detail improvements, e.g. "leaving free" of m elements in the Kernel Alg.
–Elegant handling (embedding) of the case of too few buffer elements
Question for further research: is there a simpler stable, asymptotically optimal, in-place merging algorithm?

Thank you very much for your attention