Already more than three years ago, I've written a benchmark of some of the STL containers, namely the vector, the list and the deque. Since this article was very popular, I decided to improve the benchmarks and collect again all the results. There are now more benchmarks and some problems have been fixed in the benchmark code. Moreover, I have also added a new container, the plf::colony. Therefore, there are four containers tested:
- The std::vector: This is a dynamically-resized array of elements. All the elements are contiguous in memory. If an element is inserted or removed it at a position other than the end, the following elements will be moved to fill the gap or to open a gap. Elements can be accessed at random position in constant time. The array is resized so that it can several more elements, not resized at each insert operation. This means that insertion at the end of the container is done in amortized constant time.
- The std::deque: The deque is a container that offer constant time insertion both at the front and at the back of the collection. In current c++ libraries, it is implementation as a collection of dynamically allocated fixed-size array. Not all elements are contiguous, but depending on the size of the data type, this still has good data locality. Access to a random element is also done in constant time, but with more overhead than the vector. For insertions and removal at random positions, the elements are shifted either to the front or to the back meaning that it is generally faster than the vector, by twice in average.
- The std::list: This is a doubly-linked list. It supports constant time insertions at any position of the collection. However, it does not support constant time random access. The elements are obviously not contiguous, since they are all allocated in nodes. For small elements, this collection has a very big memory overhead.
- The plf::colony: This container is a non-standard container which is unordered, it means that the insertion order will not necessarily be preserved. It provides strong iterators guarantee, pointers to non-erased element are not invalidated by insertion or erasure. It is especially tailored for high-insertion/erasure workloads. Moreover, it is also specially optimized for non-scalar types, namely structs and classes with relatively large data size (greater than 128 bits on the official documentation). Its implementation is more complicated than the other containers. It is also implemented as a list of memory blocks, but they are of increasingly large sizes. When elements are erased, there position is not removed, but marked as erased so that it can be reused for fast insertion later on. This container uses the same conventions as the standard containers and was proposed for inclusion to the standard library, which is the main reason why it's included in this benchmark. If you want more information, you can consult the official website.
In the text and results, the namespaces will be omitted. Note that I have only included sequence containers in my test. These are the most common containers in practices and these also the containers I'm the most familiar with. I could have included multiset in this benchmark, but the interface and purpose being different, I didn't want the benchmark to be confusing.
All the examples are compiled with g++-4.9.4 (-std=c++11 -march=native -O2) and run on a Gentoo Linux machine with an Intel Core i7-4770 at 3.4GHz.
For each graph, the vertical axis represent the amount of time necessary to perform the operations, so the lower values are the better. The horizontal axis is always the number of elements of the collection. For some graph, the logarithmic scale could be clearer, a button is available after each graph to change the vertical scale to a logarithmic scale.
The tests are done with several different data types. The trivial data types are varying in size, they hold an array of longs and the size of the array varies to change the size of the data type. The non-trivial data type is composed of a string (just long enough to avoid SSO (Small String Optimization) (even though I'm using GCC)). The non-trivial data types comes in a second version with noexcept move operations. Not all results are presented for each data types if there are not significant differences between in order to keep this article relatively short (it's already probably too long :P).