Heap (data structure)

In computer science, a heap is a specialized tree-based data structure which is essentially an almost complete[1] tree that satisfies the heap property: in a max heap, for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C.[2] The node at the "top" of the heap (with no parents) is called the root node.

Example of a binary max-heap with node keys being integers from 1 to 100

The heap is one maximally efficient implementation of an abstract data type called a priority queue, and in fact, priority queues are often referred to as "heaps", regardless of how they may be implemented. In a heap, the highest (or lowest) priority element is always stored at the root. However, a heap is not a sorted structure; it can be regarded as being partially ordered. A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority.

A common implementation of a heap is the binary heap, in which the tree is a binary tree (see figure). The heap data structure, specifically the binary heap, was introduced by J. W. J. Williams in 1964, as a data structure for the heapsort sorting algorithm.[3] Heaps are also crucial in several efficient graph algorithms such as Dijkstra's algorithm. When a heap is a complete binary tree, it has a smallest possible height—a heap with N nodes and for each node a branches always has log_a N height.

Note that, as shown in the graphic, there is no implied ordering between siblings or cousins and no implied sequence for an in-order traversal (as there would be in, e.g., a binary search tree). The heap relation mentioned above applies only between nodes and their parents, grandparents, etc. The maximum number of children each node can have depends on the type of heap.

Operations

The common operations involving heaps are:

Basic

find-max (or find-min): find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. peek)
insert: adding a new key to the heap (a.k.a., push[4])
extract-max (or extract-min): returns the node of maximum value from a max heap [or minimum value from a min heap] after removing it from the heap (a.k.a., pop[5])
delete-max (or delete-min): removing the root node of a max heap (or min heap), respectively
replace: pop root and push a new key. More efficient than pop followed by push, since only need to balance once, not twice, and appropriate for fixed-size heaps.[6]

Creation

create-heap: create an empty heap
heapify: create a heap out of given array of elements
merge (union): joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps.
meld: joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.

Inspection

size: return the number of items in the heap.
is-empty: return true if the heap is empty, false otherwise.

Internal

increase-key or decrease-key: updating a key within a max- or min-heap, respectively
delete: delete an arbitrary node (followed by moving last node and sifting to maintain heap)
sift-up: move a node up in the tree, as long as needed; used to restore heap condition after insertion. Called "sift" because node moves up the tree until it reaches the correct level, as in a sieve.
sift-down: move a node down in the tree, similar to sift-up; used to restore heap condition after deletion or replacement.

Implementation

Heaps are usually implemented with an implicit heap data structure, which is an implicit data structure consisting of an array (fixed size or dynamic array) where each element represents a tree node whose parent/children relationship is defined implicitly by their index. After an element is inserted into or deleted from a heap, the heap property may be violated and the heap must be balanced by swapping elements within the array.

Example of a complete binary max-heap with node keys being integers from 1 to 100 and how it would be stored in an array.

In an implicit heap data structure, the first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n + 1 in a one-based array, or 2n + 1 and 2n + 2 in a zero-based array. Computing the index of the parent node of n-th element is also straightforward. For one-based arrays the parent of element n is located at position n/2. Similarly, for zero-based arrays, is the parent is located at position (n-1)/2 (floored). This allows moving up or down the tree by doing simple index computations. Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.

Different types of heaps implement the operations in different ways, but notably, insertion is often done by adding the new element at the end of the heap in the first available free space. This will generally violate the heap property, and so the elements are then shifted up until the heap property has been reestablished. Similarly, deleting the root is done by removing the root and then putting the last element in the root and sifting down to rebalance. Thus replacing is done by deleting the root and putting the new element in the root and sifting down, avoiding a sifting up step compared to pop (sift down of last element) followed by push (sift up of new element).

Construction of a binary (or d-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2N − 2s₂(N) − e₂(N) (for a binary heap), where s₂(N) is the sum of all digits of the binary representation of N and e₂(N) is the exponent of 2 in the prime factorization of N.[7] This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear.[lower-alpha 1]

Variants

2–3 heap
B-heap
Beap
Binary heap
Binomial heap
Brodal queue
d-ary heap
Fibonacci heap
K-D Heap
Leaf heap
Leftist heap
Pairing heap
Radix heap
Randomized meldable heap
Skew heap
Soft heap
Ternary heap
Treap
Weak heap

Comparison of theoretic bounds for variants

Here are time complexities[8] of various heap data structures. Function names assume a min-heap. For the meaning of "O(f)" and "Θ(f)" see Big O notation.

Operation	find-min	delete-min	insert	decrease-key	meld
Binary[8]	Θ(1)	Θ(log n)	O(log n)	O(log n)	Θ(n)
Leftist	Θ(1)	Θ(log n)	Θ(log n)	O(log n)	Θ(log n)
Binomial[8][9]	Θ(1)	Θ(log n)	Θ(1)[lower-alpha 2]	Θ(log n)	O(log n)[lower-alpha 3]
Fibonacci[8][10]	Θ(1)	O(log n)[lower-alpha 2]	Θ(1)	Θ(1)[lower-alpha 2]	Θ(1)
Pairing[11]	Θ(1)	O(log n)[lower-alpha 2]	Θ(1)	o(log n)[lower-alpha 2][lower-alpha 4]	Θ(1)
Brodal[14][lower-alpha 5]	Θ(1)	O(log n)	Θ(1)	Θ(1)	Θ(1)
Rank-pairing[16]	Θ(1)	O(log n)[lower-alpha 2]	Θ(1)	Θ(1)[lower-alpha 2]	Θ(1)
Strict Fibonacci[17]	Θ(1)	O(log n)	Θ(1)	Θ(1)	Θ(1)
2-3 heap[18]	O(log n)	O(log n)[lower-alpha 2]	O(log n)[lower-alpha 2]	Θ(1)	?

Each insertion takes O(log(k)) in the existing size of the heap, thus $\sum _{k=1}^{n}O(\log k)$ . Since $\log n/2=(\log n)-1$ , a constant factor (half) of these insertions are within a constant factor of the maximum, so asymptotically we can assume $k=n$ ; formally the time is $nO(\log n)-O(n)=O(n\log n)$ . This can also be readily seen from Stirling's approximation.
Amortized time.
n is the size of the larger heap.
Lower bound of $\Omega (\log \log n),$ [12] upper bound of $O(2^{2{\sqrt {\log \log n}}}).$ [13]
Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[15]

Applications

The heap data structure has many applications.

Heapsort: One of the best sorting methods being in-place and with no quadratic worst-case scenarios.
Selection algorithms: A heap allows access to the min or max element in constant time, and other selections (such as median or kth-element) can be done in sub-linear time on data that is in a heap.[19]
Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by polynomial order. Examples of such problems are Prim's minimal-spanning-tree algorithm and Dijkstra's shortest-path algorithm.
Priority Queue: A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods.
K-way merge: A heap data structure is useful to merge many already-sorted input streams into a single sorted output stream. Examples of the need for merging include external sorting and streaming results from distributed data such as a log structured merge tree. The inner loop is obtaining the min element, replacing with the next element for the corresponding input stream, then doing a sift-down heap operation. (Alternatively the replace function.) (Using extract-max and insert functions of a priority queue are much less efficient.)
Order statistics: The Heap data structure can be used to efficiently find the kth smallest (or largest) element in an array.

Implementations

The C++ Standard Library provides the make_heap, push_heap and pop_heap algorithms for heaps (usually implemented as binary heaps), which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion. It also provides the container adaptor priority_queue, which wraps these facilities in a container-like class. However, there is no standard support for the replace, sift-up/sift-down, or decrease/increase-key operations.
The Boost C++ libraries include a heaps library. Unlike the STL, it supports decrease and increase operations, and supports additional types of heap: specifically, it supports d-ary, binomial, Fibonacci, pairing and skew heaps.
There is a generic heap implementation for C and C++ with D-ary heap and B-heap support. It provides an STL-like API.
The standard library of the D programming language includes std.container.BinaryHeap, which is implemented in terms of D's ranges. Instances can be constructed from any random-access range. BinaryHeap exposes an input range interface that allows iteration with D's built-in foreach statements and integration with the range-based API of the std.algorithm package.
The Java platform (since version 1.5) provides a binary heap implementation with the class java.util.PriorityQueue in the Java Collections Framework. This class implements by default a min-heap; to implement a max-heap, programmer should write a custom comparator. There is no support for the replace, sift-up/sift-down, or decrease/increase-key operations.
Python has a heapq module that implements a priority queue using a binary heap. The library exposes a heapreplace function to support k-way merging.
PHP has both max-heap (SplMaxHeap) and min-heap (SplMinHeap) as of version 5.3 in the Standard PHP Library.
Perl has implementations of binary, binomial, and Fibonacci heaps in the Heap distribution available on CPAN.
The Go language contains a heap package with heap algorithms that operate on an arbitrary type that satisfies a given interface. That package does not support the replace, sift-up/sift-down, or decrease/increase-key operations.
Apple's Core Foundation library contains a CFBinaryHeap structure.
Pharo has an implementation of a heap in the Collections-Sequenceable package along with a set of test cases. A heap is used in the implementation of the timer event loop.
The Rust programming language has a binary max-heap implementation, BinaryHeap, in the collections module of its standard library.

References

CORMEN, THOMAS H. (2009). INTRODUCTION TO ALGORITHMS. United States of America: The MIT Press Cambridge, Massachusetts London, England. pp. 151–152. ISBN 978-0-262-03384-8.
Black (ed.), Paul E. (2004-12-14). Entry for heap in Dictionary of Algorithms and Data Structures. Online version. U.S. National Institute of Standards and Technology, 14 December 2004. Retrieved on 2017-10-08 from https://xlinux.nist.gov/dads/HTML/heap.html.
Williams, J. W. J. (1964), "Algorithm 232 - Heapsort", Communications of the ACM, 7 (6): 347–348, doi:10.1145/512274.512284
The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappush
The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappop
The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heapreplace
Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of Floyd's Heap-Construction Program", Fundamenta Informaticae, IOS Press, 120 (1): 75–92, doi:10.3233/FI-2012-751.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.
"Binomial Heap | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2019-09-30.
Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. CiteSeerX 10.1.1.309.8927. doi:10.1145/28869.28874.CS1 maint: ref=harv (link)
Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory (PDF), Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, CiteSeerX 10.1.1.748.7812, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2
Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.
Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.
Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58
Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.
Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps" (PDF). SIAM J. Computing. 40 (6): 1463–1485. doi:10.1137/100785351.
Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. pp. 1177–1184. CiteSeerX 10.1.1.233.1740. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5.
Takaoka, Tadao (1999), Theory of 2–3 Heaps (PDF), p. 12
Frederickson, Greg N. (1993), "An Optimal Algorithm for Selection in a Min-Heap", Information and Computation (PDF), 104, Academic Press, pp. 197–214, doi:10.1006/inco.1993.1030, archived from the original (PDF) on 2012-12-03, retrieved 2010-10-31

External links

Heap at Wolfram MathWorld
Explanation of how the basic heap algorithms work
Bentley, Jon Louis (2000). Programming Pearls (2nd ed.). Addison Wesley. pp. 147–162. ISBN 0201657880.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[8] Each insertion takes O(log(k)) in the existing size of the heap, thus $\sum _{k=1}^{n}O(\log k)$ . Since $\log n/2=(\log n)-1$ , a constant factor (half) of these insertions are within a constant factor of the maximum, so asymptotically we can assume $k=n$ ; formally the time is $nO(\log n)-O(n)=O(n\log n)$ . This can also be readily seen from Stirling's approximation.

[amortized-11] Amortized time.

[meld-12] s the size of the larger heap.

[pairingdecreasekey-17] Lower bound of $\Omega (\log \log n),$ [12] upper bound of $O(2^{2{\sqrt {\log \log n}}}).$ [13]

[brodal-20] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[15]

[1] CORMEN, THOMAS H. (2009). INTRODUCTION TO ALGORITHMS. United States of America: The MIT Press Cambridge, Massachusetts London, England. pp. 151–152. ISBN 978-0-262-03384-8.

[2] Black (ed.), Paul E. (2004-12-14). Entry for heap in Dictionary of Algorithms and Data Structures. Online version. U.S. National Institute of Standards and Technology, 14 December 2004. Retrieved on 2017-10-08 from https://xlinux.nist.gov/dads/HTML/heap.html.

[3] Williams, J. W. J. (1964), "Algorithm 232 - Heapsort", Communications of the ACM, 7 (6): 347–348, doi:10.1145/512274.512284

[4] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappush

[5] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappop

[6] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heapreplace

[7] Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of Floyd's Heap-Construction Program", Fundamenta Informaticae, IOS Press, 120 (1): 75–92, doi:10.3233/FI-2012-751.

[CLRS-9] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[10] "Binomial Heap | Brilliant Math & Science Wiki". brilliant.org. Retrieved 2019-09-30.

[Fredman_And_Tarjan-13] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. CiteSeerX 10.1.1.309.8927. doi:10.1145/28869.28874.CS1 maint: ref=harv (link)

[Iacono-14] Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory (PDF), Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, CiteSeerX 10.1.1.748.7812, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2

[Fredman1999-15] Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[16] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

[18] Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58

[19] Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[21] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (November 2011). "Rank-pairing heaps" (PDF). SIAM J. Computing. 40 (6): 1463–1485. doi:10.1137/100785351.

[22] Brodal, Gerth Stølting; Lagogiannis, George; Tarjan, Robert E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. pp. 1177–1184. CiteSeerX 10.1.1.233.1740. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5.

[23] Takaoka, Tadao (1999), Theory of 2–3 Heaps (PDF), p. 12

[24] Frederickson, Greg N. (1993), "An Optimal Algorithm for Selection in a Min-Heap", Information and Computation (PDF), 104, Academic Press, pp. 197–214, doi:10.1006/inco.1993.1030, archived from the original (PDF) on 2012-12-03, retrieved 2010-10-31

Tree data structures
Search trees (dynamic sets/associative arrays)	2–3 2–3–4 AA (a,b) AVL B B+ B* B^x (Optimal) Binary search Dancing HTree Interval Order statistic (Left-leaning) Red-black Scapegoat Splay T Treap UB Weight-balanced
Heaps	Binary Binomial Brodal Fibonacci Leftist Pairing Skew van Emde Boas Weak
Tries	Ctrie C-trie (compressed ADT) Hash Radix Suffix Ternary search X-fast Y-fast
Spatial data partitioning trees	Ball BK BSP Cartesian Hilbert R k-d (implicit k-d) M Metric MVP Octree Priority R Quad R R+ R* Segment VP X
Other trees	Cover Exponential Fenwick Finger Fractal tree index Fusion Hash calendar iDistance K-ary Left-child right-sibling Link/cut Log-structured merge Merkle PQ Range SPQR Top

Data structures
Types	Collection Container
Abstract	Associative array Multimap List Stack Queue Double-ended queue Priority queue Double-ended priority queue Set Multiset Disjoint-set
Arrays	Bit array Circular buffer Dynamic array Hash table Hashed array tree Sparse matrix
Linked	Association list Linked list Skip list Unrolled linked list XOR linked list
Trees	B-tree Binary search tree AA tree AVL tree Red–black tree Self-balancing tree Splay tree Heap Binary heap Binomial heap Fibonacci heap R-tree R* tree R+ tree Hilbert R-tree Trie Hash tree
Graphs	Binary decision diagram Directed acyclic graph Directed acyclic word graph
List of data structures