Heap

What is a heap

  • A heap is a binary tree storing keys at its internal nodes and satisfying the following properties:
    • Heap-Order: for every internal node v other than the root, key(v) ≥ key(parent(v))
    • Complete Binary Tree: let h be the height of the heap
      • for i = 0, … , h − 1, there are 2^i nodes of depth i
      • at depth h − 1, the internal nodes are to the left of the external nodes
    • The last node of a heap is the rightmost internal node of depth h − 1

Height of a Heap

  • Theorem: A heap storing n keys has height O(log n)
    Proof: (we apply the complete binary tree property)
    • Let h be the height of a heap storing n keys
    • Since there are 2i keys at depth i = 0, … , h − 2 and at least one key at depth h − 1, we have n ≥ 1 + 2 + 4 + … + 2h−2 + 1
    • Thus, n ≥ 2h−1 , i.e., h ≤ log n + 1

Heaps and Priority Queues

  • We can use a heap to implement a priority queue
  • We store a (key, element) item at each internal node
  • We keep track of the position of the last node
  • For simplicity, we show only the keys in the pictures

Insertion into a Heap

  • Method insertItem of the priority queue ADT corresponds to the insertion of a key k to the heap
  • The insertion algorithm consists of three steps􀂄:
    • Find the insertion node z (the new last node)
    • Store k at z and expand z into an internal node
    • Restore the heap-order property (upheap)

Upheap

  • After the insertion of a new key k, the heap-order property may be violated
  • Algorithm upheap restores the heap-order property by swapping k along an upward path from the insertion node
  • Upheap terminates when the key k reaches the root or a node whose parent has a key smaller than or equal to k
  • Since a heap has height O(log n), upheap runs in O(log n) time

Removal from a Heap

  • Method removeMin of the priority queue ADT corresponds to the removal of the root key from the heap
  • The removal algorithm onsists of three steps
    • Replace the root key with the key of the last node w
    • Compress w and its children into a leaf
    • Restore the heap-order property (downheap)

Downheap

  • After replacing the root key with the key k of the last node, the heap-order property may be violated
  • Algorithm downheap restores the heap-order property by swapping key k along a downward path from the root
  • Upheap terminates when key k reaches a leaf or a node whose children have keys greater than or equal to k
  • Since a heap has height O(log n), downheap runs in O(log n) time

Updating the Last Node

  • The insertion node can be found by traversing a path of O(log n) nodes
    • While the current node is a right child, go to the parent node
    • If the current node is a left child, go to the right child
    • While the current node is internal, go to the left child
  • Similar algorithm for updating the last node after a removal

Heap-Sort

  • Consider a priority queue with n items implemented by means of a heap
    • the space used is O(n)
    • methods insertItem and removeMin take O(log n) time
    • methods size, isEmpty, minKey, and minElement take time O(1) time
  • Using a heap-based priority queue, we can sort a sequence of n elements in O(n log n) time
  • The resulting algorithm is called heap-sort
  • Heap-sort is much faster than quadratic sorting algorithms, such as insertion-sort and selection-sort

Vector-basedHeap Implementation

TODO

Merging Two Heaps

  • We are given two two heaps and a key k
  • We create a new heap with the root node storing k and with the two heaps as subtrees
  • We perform downheap to restore the heaporder property

Bottom-up Heap Construction

  • We can construct a heap storing n given keys in using a bottom-up construction with log n phases
  • In phase i, pairs of heaps with 2i −1 keys are merged into heaps with 2i+1−1 keys

Analysis

  • We visualize the worst-case time of a downheap with a proxy path that goes first right and then repeatedly goes left until the bottom of the heap (this path may differ from the actual downheap path)
  • Since each node is traversed by at most two proxy paths, the total number of nodes of the proxy paths is O(n)
  • Thus, bottom-up heap construction runs in O(n) time
  • Bottom-up heap construction is faster than n successive insertions and speeds up the first phase of heap-sort

Code

package examples;


import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Random;


/**
 * @author ps
 * Various sort programs for int arrays
 */
public class SortTest {


    public static long cnt;
    static Random rand = new Random();
    static int [] b;

    /**
     * swap the array elements a[i] and a[k]
     * @param a int array
     * @param i position in the array 'a'
     * @param k position in the array 'a'
     */
    static void swap(int [] a, int i, int k){
        int tmp=a[i];
        a[i]=a[k];
        a[k]=tmp;
        cnt++;
    }

    static void heapSort(int [] a){
        // make 'a' to be an max-heap:
//        for (int i=1; i<a.length; i++) upHeap(a,i);
        for (int i=a.length/2;i>=0;i--) downHeap(a,i,a.length);
        // System.out.println("max-heap? "+heapCheck(a));
        for (int i = a.length-1; i>0; i--){
            swap(a,0,i); // now a[i] is at its final position
            downHeap(a, 0, i);
        }
    }

    private static void upHeap(int[] a, int i) {
        // assume a[0..i-1] is a max-heap, swap element
        // at position i with its parent and so on
        // until a[0..i] is a max-heap
        int parent = (i-1)/2;
        while (i>0 && a[parent] < a[i]){
            swap(a,parent,i);
            i = parent;
            parent = (i-1)/2;
        }
    }

    private static void downHeap(int[] a, int i, int len) {
        // assume a [i..len-1] is a max-heap, but the element
        // at position i possibly violates the heap condition.
        // swap a[i] with its bigger child until a[i..len-1] is a heap.
        int left = i*2+1;
        while (left < len){
            int cand = left;
            int right = left+1;
            // find the bigger child (if there are two)
            if (right < len && a[right] > a[left]) cand = right;
            if (a[i] >= a[cand]) break;
            swap(a,i,cand);
            i = cand;
            left = i*2+1;
        }         
    }


    private static boolean heapCheck(int [] a){
        for (int i=1;i<a.length;i++) if (a[(i-1)/2]<a[i]) return false;
        return true; // we have a correct max-heap
    }

    public static void main(String[] args) {
        long t1=0,t2=0,te1=0,te2=0,eTime=0,time=0;
        int n = 50000000;
        // we need a random generator
        Random rand=new Random(Integer.MAX_VALUE);
        //rand.setSeed(54326346); // initialize always in the same state
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        // new array
        int [] a = new int[n];
        // fill it randomly
        for (int i=0;i<a.length ;i++) {
            a[i]=rand.nextInt(n);
        }
        for (int i=0;i<a.length ;i++) {
            swap(a,i,rand.nextInt(n-1));
        }

        cnt=0;  // for statistcs reasons
        // get Time
        te1=System.nanoTime();
        t1 = threadBean.getCurrentThreadCpuTime();
        heapSort(a);

        // System.out.println("heap? "+heapCheck(a));
        te2 = System.nanoTime();
        t2 = threadBean.getCurrentThreadCpuTime();
        time=t2-t1;
        eTime=te2-te1;
        System.out.println("# elements: "+n);
        System.out.println("CPU-Time usage: "+time/1000000.0+" ms");
        System.out.println("elapsed time: "+eTime/1e6+" ms");
        System.out.println("sorted? "+heapCheck(a));
        System.out.println("swap operation needed: "+cnt);
    }
}