
C++ std::set std::priority_queue

Date: 2022-12-18
Last modified: 2023-04-05


Both std::priority_queue and std::set (and std::multiset) are containers that store elements and give you ordered access to them, and both have O(log N) insertion. What are the advantages of using one over the other, and what kinds of situations call for each?

While I know that the underlying structures are different, I am less interested in the difference in their implementation than in a comparison of their performance and suitability for various uses.

Note: I know that a set holds no duplicates. That’s why I also mentioned std::multiset: it behaves exactly like std::set but can be used where stored elements are allowed to compare equal.

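std::priority_queue supports only a few operations: inserting an element in O(log N), reading the top (highest-priority) element in O(1), and removing the top element in O(log N).

std::set has more possibilities: inserting, finding, or erasing any element in O(log N), finding the first element not less than a given value with lower_bound in O(log N), stepping to the previous or next element in sorted order, and reading the smallest or largest element in O(1).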
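A minimal sketch of that difference in interface. The helper names drain and first_not_less are made up for illustration and are not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Hypothetical helper: the only way to see every element of a
// priority_queue is to pop it. Takes the queue by value so the
// caller's copy is left untouched.
std::vector<int> drain(std::priority_queue<int> pq) {
    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());  // O(1): peek at the largest element
        pq.pop();                 // O(log N): remove it
    }
    // Elements come out largest-first with the default comparator.
    assert(std::is_sorted(out.rbegin(), out.rend()));
    return out;
}

// Hypothetical helper: a set can answer "what is the first element >= key?"
// without modifying the container. Returns fallback if there is none.
int first_not_less(const std::set<int>& s, int key, int fallback) {
    auto it = s.lower_bound(key);  // O(log N)
    return it == s.end() ? fallback : *it;
}
```

The queue has no counterpart to first_not_less: it exposes push, top, and pop, and nothing that looks inside.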

A priority queue only gives you access to one element in sorted order - i.e., you can get the highest priority item, and when you remove that, you can get the next highest priority, and so on. A priority queue also allows duplicate elements, so it’s more like a multiset than a set.
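To make the point about duplicates concrete, here is a small sketch that feeds the same input to all three containers. The Sizes struct and the function name are assumptions introduced for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <vector>

// Hypothetical result type: how many elements each container kept.
struct Sizes {
    std::size_t priority_queue;
    std::size_t multiset;
    std::size_t set;
};

// Builds each container from the same input and reports the sizes.
Sizes sizes_after_insert(const std::vector<int>& input) {
    std::priority_queue<int> pq(input.begin(), input.end());
    std::multiset<int> ms(input.begin(), input.end());
    std::set<int> s(input.begin(), input.end());
    // A set drops elements that compare equal, so it can never be larger.
    assert(s.size() <= ms.size());
    return {pq.size(), ms.size(), s.size()};
}
```

For an input such as {5, 1, 5, 3}, the priority queue and the multiset both keep four elements, while the set keeps three.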

[Edit: As @Tadeusz Kopec pointed out, building a heap is linear in the number of items in the heap, whereas building a set is O(N log N) unless it’s being built from a sequence that’s already ordered (in which case it is also linear).]
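A sketch of the construction paths this refers to. The function names are hypothetical; the behavior relied on is that the range constructor of std::priority_queue copies the elements and then calls std::make_heap.

```cpp
#include <cassert>
#include <queue>
#include <set>
#include <vector>

// Builds the heap in one pass: copy the range, then heapify in O(N).
std::priority_queue<int> build_heap_at_once(const std::vector<int>& v) {
    return std::priority_queue<int>(v.begin(), v.end());
}

// Builds an equivalent heap with N separate pushes, each O(log N).
std::priority_queue<int> build_heap_by_pushing(const std::vector<int>& v) {
    std::priority_queue<int> pq;
    for (int x : v) pq.push(x);
    assert(pq.size() == v.size());
    return pq;
}

// Builds a set from the same range: O(N log N) in general, linear only
// when the input is already sorted by the set's comparator.
std::set<int> build_set(const std::vector<int>& v) {
    return std::set<int>(v.begin(), v.end());
}
```

Both heap builders produce a queue with the same top element; the difference is only in how much work construction costs.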

A set allows you full access in sorted order, so you can, for example, find two elements somewhere in the middle of the set, then traverse in order from one to the other.
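A minimal sketch of such a traversal, using lower_bound and upper_bound to find the two endpoints. The helper name elements_between is an assumption for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: collects every element in [lo, hi], ascending.
std::vector<int> elements_between(const std::set<int>& s, int lo, int hi) {
    std::vector<int> out;
    if (lo > hi) return out;          // empty interval, nothing to walk
    auto first = s.lower_bound(lo);   // first element >= lo
    auto last = s.upper_bound(hi);    // first element > hi
    for (auto it = first; it != last; ++it) {
        out.push_back(*it);           // in-order step between neighbours
    }
    assert(std::is_sorted(out.begin(), out.end()));
    return out;
}
```

Nothing comparable is possible with a priority queue, which would have to pop every element above the range first.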

set/multiset are generally backed by a balanced binary search tree (typically a red-black tree).

priority_queue is generally backed by a binary heap, stored by default in a std::vector.

So the question is really: when should you use a binary search tree instead of a heap?

Both structures are laid out as a tree; however, the rules about the relationship between a parent and its children are different.

We will call the positions P for parent, L for left child, and R for right child.

In a binary search tree, L < P < R.

In a min-heap, P < L and P < R. (A max-heap, which is what std::priority_queue uses by default, reverses both comparisons.)

So binary trees sort “sideways” and heaps sort “upwards”.

So if we look at this as a triangle, then in the binary search tree L, P, and R are completely sorted, whereas in the heap the relationship between L and R is unknown (only their relationship to P is known).
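The heap rule can be checked by hand on a vector. This is a sketch that assumes the usual array layout, in which the parent of index i sits at (i - 1) / 2; the function names are made up for the example, and std::greater is used so that the result is a min-heap like the one described here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Rearranges v into a min-heap (std::greater flips the library's
// default max-heap ordering).
void make_min_heap(std::vector<int>& v) {
    std::make_heap(v.begin(), v.end(), std::greater<int>());
    assert(std::is_heap(v.begin(), v.end(), std::greater<int>()));
}

// Checks only the parent/child rule. Siblings are never compared,
// because the heap makes no promise about them.
bool parents_not_greater_than_children(const std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        std::size_t parent = (i - 1) / 2;
        if (v[parent] > v[i]) return false;
    }
    return true;
}
```

After make_min_heap, the smallest element is at the front, but the rest of the vector is generally not sorted.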

This has the following effects:

If you have an unsorted array and want to turn it into a binary search tree, it takes O(N log N) time. Turning it into a heap takes only O(N) time, because most elements sit near the bottom of the heap and need to move only a few levels.

Heaps are more efficient if you only need the extreme element (lowest or highest by some comparison function). Heaps only do the comparisons (lazily) necessary to determine the extreme element.

Binary search trees perform the comparisons necessary to order the entire collection, and keep the entire collection sorted at all times.

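Heaps have constant-time lookup (peek) of the extreme element. A plain binary search tree needs logarithmic time to walk down to its lowest element, although std::set is required to provide begin() in constant time, so reading its smallest element is O(1) as well.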
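For the standard containers specifically, the peek looks like this. A sketch assuming a min-heap built with std::greater; the MinHeap alias and the function names are hypothetical. Note that std::set::begin() and rbegin() are constant-time operations.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <set>
#include <vector>

using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Smallest element of a min-heap: top() is O(1).
int smallest(const MinHeap& h) {
    assert(!h.empty());
    return h.top();
}

// Smallest element of a set: begin() is O(1).
int smallest(const std::set<int>& s) {
    assert(!s.empty());
    return *s.begin();
}

// Largest element of a set: rbegin() is O(1).
// A min-heap offers no constant-time way to read its largest element.
int largest(const std::set<int>& s) {
    assert(!s.empty());
    return *s.rbegin();
}
```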

  // template <class T, class Container = vector<T>,
  //           class Compare = less<typename Container::value_type>>
  // class priority_queue;
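The Container and Compare parameters in that declaration can be supplied explicitly. Below is a sketch with a made-up Task type and a made-up comparator LowerNumberFirst; neither comes from the original text.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical element type for illustration.
struct Task {
    int priority;
    std::string name;
};

// Compare returns true when a should come out after b, so this
// ordering pops the lowest priority number first.
struct LowerNumberFirst {
    bool operator()(const Task& a, const Task& b) const {
        return a.priority > b.priority;
    }
};

using TaskQueue = std::priority_queue<Task, std::vector<Task>, LowerNumberFirst>;

// Removes and returns the next task. Precondition: q is not empty.
Task next(TaskQueue& q) {
    assert(!q.empty());
    Task t = q.top();
    q.pop();
    return t;
}

// Pops every task and returns the names in the order they came out.
std::vector<std::string> pop_order(TaskQueue q) {
    std::vector<std::string> names;
    while (!q.empty()) names.push_back(next(q).name);
    return names;
}
```

With the default Compare (std::less) the largest element is on top; supplying a "greater than" comparison, as here, turns the queue into a min-heap.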
