Merge Sort
Divide-and-Conquer
- Divide-and conquer is a general algorithm design paradigm:
- Divide: divide the input data S in two disjoint subsets S1 and S2
- Recur: solve the subproblems associated with S1 and S2
- Conquer: combine the solutions for S1 and S2 into a solution for S
- The base case for the recursion are subproblems of size 0 or 1
- Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm
- Like heap-sort
- It uses a comparator
- It has O(n log n) running time
- Unlike heap-sort
- It does not use an auxiliary priority queue
- It accesses data in a sequential manner (suitable to sort data on a disk)
Merging Two Sorted Sequences
- The conquer step of merge-sort consists of merging two sorted sequences A and B into a sorted sequence S containing the union of the elements of A and B
- Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time
Algorithm
Algorithm mergeSort(S, C)
Input sequence S with n
elements, comparator C
Output sequence S sorted
according to C
if S.size() > 1
(S1, S2)← partition(S, n/2)
mergeSort(S1, C)
mergeSort(S2, C)
S ← merge(S1, S2)
Algorithm merge(A, B)
Input sequences A and B with
n/2 elements each
Output sorted sequence of A ∪ B
S ← empty sequence
while ¬A.isEmpty() ∧ ¬B.isEmpty()
if A.first().element() < B.first().element()
S.insertLast(A.remove(A.first()))
else
S.insertLast(B.remove(B.first()))
while ¬A.isEmpty()
S.insertLast(A.remove(A.first()))
while ¬B.isEmpty()
S.insertLast(B.remove(B.first()))
return S
Merge-Sort Tree
- An execution of merge-sort is depicted by a binary tree
- each node represents a recursive call of merge-sort and stores
- unsorted sequence before the execution and its partition
- sorted sequence at the end of the execution
- the root is the initial call
- the leaves are calls on subsequences of size 0 or 1
![Merge_Sort_Tree](Images/Merge_Sort_Tree.PNG)
Analysis of Merge-Sort
- The height h of the merge-sort tree is O(log n)
- at each recursive call we divide in half the sequence,
- The overall amount or work done at the nodes of depth i is O(n)
- we partition and merge 2i sequences of size n/2i
- we make 2i+1 recursive calls
- Thus, the total running time of merge-sort is O(n log n)
![Merge Sort](Images/Merge_Sort.PNG)
Comparison
![Sort_Types](Images/Sort_Types.PNG)
Code
package examples;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Random;
public class SortTest {
public static long cnt;
static Random rand = new Random();
static int [] b;
public static boolean sortCheck(int[] a) {
for (int i=0;i<a.length-1;i++){
if (a[i]>a[i+1]) return false;
}
return true;
}
public static void mergeSort(int[]a){
b = new int[a.length];
mSort(a,0,a.length-1);
}
private static void mSort(int[] a, int from, int to) {
if (from == to) return;
int mid = (from+to)/2;
mSort(a,from,mid);
mSort(a,mid+1,to);
merge(a,from,mid,to);
}
private static void merge(int[] a, int from, int mid, int to) {
int fPtr=from,sPtr=mid+1,toPtr=from;
while(true){
if (fPtr > mid) break;
else if( sPtr>to) {
while(fPtr <= mid) b[toPtr++] = a[fPtr++];
break;
}
else {
if (a[sPtr] > a[fPtr]) b[toPtr++]=a[fPtr++];
else b[toPtr++] = a[sPtr++];
}
}
while (--toPtr >= from) a[toPtr] = b[toPtr];
}
public static void main(String[] args) {
long t1=0,t2=0,te1=0,te2=0,eTime=0,time=0;
int n = 50000000;
Random rand=new Random(Integer.MAX_VALUE);
ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
int [] a = new int[n];
for (int i=0;i<a.length ;i++) {
a[i]=rand.nextInt(n);
}
for (int i=0;i<a.length ;i++) {
swap(a,i,rand.nextInt(n-1));
}
cnt=0;
te1=System.nanoTime();
t1 = threadBean.getCurrentThreadCpuTime();
mergeSort(a);
te2 = System.nanoTime();
t2 = threadBean.getCurrentThreadCpuTime();
time=t2-t1;
eTime=te2-te1;
System.out.println("# elements: "+n);
System.out.println("CPU-Time usage: "+time/1000000.0+" ms");
System.out.println("elapsed time: "+eTime/1e6+" ms");
System.out.println("sorted? "+sortCheck(a));
System.out.println("swap operation needed: "+cnt);
}
}