Fast priority queues in Golang: Hierarchical Queue

A 4 minutes story written on Sep 2017 by Adrian B.G.

Writing O(1) high performance data structures is a multi-post series:

  1. Hierarchical Queue (this article)
  2. Hierarchical Heap
  3. Ladder Queue (soon)

A priority queue is an abstract data structure, used to store values (any data) with a priority. You can insert data at any time with any priority, but you can only take out the value with the highest priority.

image

Magic

When to use priority queues ⁉

I confess, I didn’t use many of them in my apps, but there are a few problems perfect for it:

  • triage — any processing you have based on a score/importance
  • ordering — when your data has a specific order of processing/analysis, ex: process the DAU KPI before MAU
  • Event driven simulation — priorities can be timestamps
  • Graph searching
  • Load balancing — the priority is the inverse of the load

Baby steps

I started a long last relationship with Go. I didn’t want my training to be in vain so I began to look for high performance data structures to implement (that aren’t already). After a few searches I ended up with this image, snippet from a book

image

The 2 bottom lines attracted my interest, they are O(1) multi-structure priority queues. First of them is the Hierarchical Heap, which is based on a Hierarchical Queue, so let’s dive in.

image

Hierarchical Queue

I couldn’t find any open-source code or pseudocode for this structure, if you find any issue please let me know.

“When the priority values are limited to small integers (e.g. digital images often have 8-bit or 12-bit integers as pixel values when they come off the imaging sensor), it is possible to allocate a FIFO queue (bucket) for each possible priority value. An array contains a pointer to each of these buckets, and when enqueueing an element, the correct bucket can be directly indexed using the priority value. Both enqueue and dequeue are O(1) operations.”

Papers used for the algorithm and snippets:

It is a simple structure, very fast but it has a few limitations.

  1. It has only a limited number of priority values (in my implementation is uin8(0–255)). Each value has it’s own queue, so increasing the number you end up with a memory overhead.

  2. Once the highest priority queue is empty, it is removed and the next queue can begin to empty, and it cannot be created again. This means we must fill the queues before we start to dequeue, otherwise our performance will decrease , example: The dequeue process has only 1 priority left (255) and we enqueue other elements. All the new elements will be pushed in the 255 queue (because the 0–254 are empty and removed).

image

Go code go

Priorities can be uint8 and values interface{} .

I decided to remove the 2nd restriction, I think it made the structure too restrictive for most of the real world scenarios. I managed to keep avery similar performance (≤ 50ns per operation). I cached the highest priority that has values and start to dequeue from there. In some cases, the Dequeue process can be O(1 + k), where K number of empty queues, but overall is amortized if the priorities are well balanced.

This means that the life of the structure is extended, it can be reused and higher priority values can be added even after the dequeue process started.

In the first iteration I used linked lists as queues (used golang list.List), and the average operation time was 150–200ns. I ended up using a faster structure (thanks to a suggestion of Egon Elbre): github.com/karalabe/cookiejar/collection/deque. It reduced the operation time down to under 50ns (it’s a pretty fast damn list).

Because concurrency is important in Go, the structure has a mutex. It can be used manually or automatically (each function call blocks the mutex).

To be as generic as could be, the queues can store any data type interface{}.

Packing up

The Hierarchical Queue structure is part of “priorityqueue” package, it has 100% test coverage, examples, benchmarks and documentation.

Next

The next data structure is the Hierarchical Heap, which is based on the Hierarchical Queue, removing it’s limitations for a small cost of performance.

Fast priority queues in Golang: Hierarchical Heap

Contribution

It is my first library written in Go, I never used these algorithm in production, so please correct me if I did something wrong.

Thanks! 🤝

Please share the article, subscribe or send me your feedback so I can improve the following posts!

I curate a list of articles, talks and papers for one/two times per month. They are mostly related to computer science, distributed systems, databases, Go, containers and Cloud solutions.

comments powered by Disqus