Handling Non-Movable and Non-Copyable Types in Vectors

When working in C++, we sometimes encounter special types such as mutexes, file descriptors, or hardware resources that cannot be copied or moved due to their unique ownership semantics. A common challenge arises when you want to store these non-movable and non-copyable objects in a container like std::vector.

Why Can't We Use std::vector?

The standard library container std::vector requires its element type to be copyable or movable because it needs to reallocate and move objects around internally as it grows. This is especially problematic when calling methods like emplace_back: a call may trigger a resize and therefore a move, so the element type must satisfy this requirement even if no resize ever actually happens:

std::vector<std::mutex> vec;
vec.emplace_back(); // error: mutex is non-copyable and non-movable

Since std::mutex explicitly forbids copying and moving, the above code will not compile.

Why Not std::array?

Another common container, std::array, also isn't suitable for this scenario for two main reasons:

  1. Fixed Compile-time Size: The size of std::array must be known at compile time. This makes it inflexible for cases where the required capacity might vary or only be known at runtime.

  2. Default Construction: When a std::array is default-constructed, all of its elements are default-initialized, so the element type must be default-constructible. This is problematic for types that manage resources or enforce strict ownership and cannot be meaningfully default-constructed.

Why Not std::vector<T*>?

Another alternative you might consider is using a container like std::vector<std::unique_ptr<T>> or std::vector<T*>. Although this avoids moving the actual objects, it introduces a host of performance drawbacks:

Using such a container is often ill-advised in performance-critical code because each pointer element manages its own separately allocated object, resulting in many distinct heap allocations rather than a single contiguous memory block. This proliferation of allocations incurs significant allocator overhead and leads to memory fragmentation. Furthermore, iterating over such a container forces the CPU to chase pointers across scattered memory locations, degrading cache locality and increasing cache miss rates. Modern hardware prefetchers are optimized for linear access patterns; the double indirection introduced by the (smart) pointers disrupts these heuristics, causing pipeline stalls and reduced throughput.

Our benchmarks, presented later, demonstrate just how substantial these performance penalties can be.

Introducing static_vector

To address the issue, one solution is to create a specialized container, which we will call static_vector. This container allocates memory upfront and never reallocates. Thus, it doesn't require objects to be movable or copyable, as elements are constructed directly in place without ever being moved.

The primary trade-off of static_vector is its fixed capacity — ideal for static or preplanned use-cases, but inappropriate for dynamic runtime-resizing scenarios.

Note

For simplicity, the following implementation omits allocator template parameters and additional member functions that might be expected in a fully-featured container. The focus here is solely on supporting non-copyable and non-movable types through safe, in-place construction.

Here's an example implementation:

template<class T>
class static_vector
{
public:
    constexpr static_vector() noexcept = delete;
    constexpr static_vector(const static_vector&) noexcept = delete;
    constexpr auto operator=(const static_vector&) noexcept -> static_vector& = delete;

    explicit static_vector(std::size_t capacity) : cap_(capacity)
    {
        // std::allocator::allocate may throw std::bad_alloc,
        // so this constructor is deliberately not noexcept.
        data_ = std::allocator<T>{}.allocate(capacity);
    }

    constexpr static_vector(static_vector&& other) noexcept
        : cap_(std::exchange(other.cap_, 0)),
          size_(std::exchange(other.size_, 0)),
          data_(std::exchange(other.data_, nullptr))
    {}

    auto operator=(static_vector&& other) noexcept -> static_vector&
    {
        if(this != &other) {
            std::destroy_n(data_, size_);
            if(data_ != nullptr) {
                std::allocator<T>{}.deallocate(data_, cap_);
            }
            cap_ = std::exchange(other.cap_, 0);
            size_ = std::exchange(other.size_, 0);
            data_ = std::exchange(other.data_, nullptr);
        }
        return *this;
    }

    ~static_vector() noexcept
    {
        std::destroy_n(data_, size_);
        // Guard against the moved-from state: deallocate requires a pointer
        // previously obtained from allocate, so skip it when data_ is null.
        if(data_ != nullptr) {
            std::allocator<T>{}.deallocate(data_, cap_);
        }
    }

    template<class... Args>
    constexpr auto emplace_back(Args&&... args) noexcept -> void
    {
        new(data_ + size_) T{std::forward<Args>(args)...};
        size_++;
    }

    template<class Self>
    constexpr auto operator[](this Self&& self, std::size_t idx) noexcept
        -> decltype(auto)
    {
        return std::forward<Self>(self).data_[idx];
    }

    constexpr auto begin() noexcept -> T* { return data_; }
    constexpr auto end() noexcept -> T* { return data_ + size_; }

    constexpr auto size() const noexcept -> std::size_t { return size_; }
    constexpr auto capacity() const noexcept -> std::size_t { return cap_; }

private:
    std::size_t cap_;
    std::size_t size_ = 0;
    T* data_ = nullptr;
};

How Does It Work?

The key to static_vector is its preallocation of memory and its guarantee never to reallocate or move existing elements.

This means:

  • No resizing: Capacity is fixed at construction.
  • No moves or copies: Objects remain exactly where they were constructed.
  • Safe for non-movable/non-copyable types: Great for mutexes, unique hardware handles, and similar objects.

Example:

Here's how you can use it to store mutexes:

static_vector<std::mutex> mutexes(10);

mutexes.emplace_back();
mutexes.emplace_back();

mutexes[0].lock();
// critical section
mutexes[0].unlock();

Capacity Planning

In our coroutine runtime, we rely on static_vector to store workers and synchronization primitives like std::mutex. The number of worker threads typically corresponds to the number of CPU cores, so we can determine the required capacity at runtime during initialization.

This one-time planning phase ensures we don't over- or under-allocate, while maintaining the safety guarantees we need when working with non-movable types. Because these workers and mutexes must remain in-place and unmodified throughout the program’s lifecycle, static_vector gives us exactly the right semantics.

However, it's crucial to carefully manage the number of calls to emplace_back. Since static_vector doesn't support resizing, adding elements beyond its predefined capacity is undefined behavior. Always ensure that your program's logic prevents exceeding the container's capacity—preferably by clearly asserting or checking capacity before attempting to construct new elements.

There are different approaches to handling this scenario. Let us present some of them:

Do Nothing and Hope

Leave the class exactly as it is in the previous listing. If client code calls emplace_back() after the vector is full, the behaviour is undefined. In practice you usually add an assert(size_ < cap_) that terminates the program in debug builds. This is the most straightforward, but also the most dangerous option. The assert helps in debug builds, but in production-critical scenarios where an overflow must never fail silently, an unconditional runtime check that calls std::terminate() is safer. Keep in mind that this check comes with a small performance hit.

template<class... Args>
constexpr void emplace_back(Args&&... args) noexcept
{
    assert(size_ < cap_ && "static_vector capacity exceeded!");

    if (size_ >= cap_) [[unlikely]] {
        std::terminate();
    }
    new (data_ + size_) T{std::forward<Args>(args)...};
    ++size_;
}

Wrap-Around When Full

Once the write position reaches cap_, we cycle back to index 0 and start overwriting the oldest element — exactly how a circular queue or log buffer behaves. size_ is capped at cap_, and we add a head_ index that advances modulo cap_.

template<class T>
class ring_static_vector
{
public:
    template<class... Args>
    void emplace_back(Args&&... args)
    {
        if (size_ < cap_) {
            new (data_ + head_) T{std::forward<Args>(args)...};
            ++size_;
        }
        else {
            data_[head_].~T();
            new (data_ + head_) T{std::forward<Args>(args)...};
        }
        head_ = (head_ + 1) % cap_;
    }

    T& operator[](std::size_t i) noexcept
    {
        std::size_t phys = (head_ + cap_ - size_ + i) % cap_;
        return data_[phys];
    }
private:
    std::size_t cap_;
    std::size_t size_  = 0;
    std::size_t head_  = 0;   // next position to (re)write
    T*          data_  = nullptr;
};

While the behaviour is no longer undefined, we have introduced silent data loss. Most of the time this is not what you want, but in scenarios where the latest N items are enough, this solution is a good fit. Keep in mind that the destructor and iterators, among others, must account for the difference between logical and physical order.

Builder Pattern — Make Construction a Separate Phase

The builder pattern pushes the “no-overflow” guarantee all the way into the type system: you do all the mutation in a throw-away builder object and, once you are satisfied, seal the data inside an immutable static_vector that no longer exposes any insertion API. In other words, capacity planning becomes a compile-time property of the result type: if you are holding a static_vector<T>, it is impossible to blow past its storage because there simply is no emplace_back() member to call.

We introduce a static_vector_builder<T> that only supports emplace_back(). When you are done, you call build() which moves the internal storage into a plain static_vector<T> that no longer exposes any mutating inserters.

After the hand-off, capacity overflow is impossible—there is simply no API for it.

template<class T>
class static_vector_builder; // forward declaration for the friend declaration below

template<class T>
class static_vector
{
    friend class static_vector_builder<T>;

public:
    static_vector()                        = delete;
    static_vector(const static_vector&)    = delete;
    static_vector& operator=(const static_vector&) = delete;

    constexpr static_vector(static_vector&& other) noexcept
        : cap_(std::exchange(other.cap_, 0)),
          size_(std::exchange(other.size_, 0)),
          data_(std::exchange(other.data_, nullptr))
    {}

    constexpr ~static_vector()
    {
        std::destroy_n(data_, size_);
        // Guard against the moved-from state, where data_ is nullptr.
        if (data_ != nullptr) {
            std::allocator<T>{}.deallocate(data_, cap_);
        }
    }

    template<class Self>
    constexpr auto operator[](this Self&& self, std::size_t idx) noexcept
        -> decltype(auto)
    {
        return std::forward<Self>(self).data_[idx];
    }
    // NO emplace_back(...)!
private:
    // private ctor used by builder
    constexpr static_vector(std::size_t c, std::size_t s, T* d) noexcept
        : cap_(c), size_(s), data_(d) {}

    std::size_t cap_, size_;
    T* data_;
};

template<class T>
class static_vector_builder
{
public:
    explicit static_vector_builder(std::size_t capacity) : cap_(capacity)
    {
        // allocate already returns T*, so no cast is needed
        data_ = std::allocator<T>{}.allocate(capacity);
    }

    ~static_vector_builder()
    {
        // If build() never happened, destroy constructed elements and
        // release the storage; after build(), data_ is nullptr.
        if (data_ != nullptr) {
            std::destroy_n(data_, size_);
            std::allocator<T>{}.deallocate(data_, cap_);
        }
    }
    static_vector_builder(const static_vector_builder&)            = delete;
    static_vector_builder& operator=(const static_vector_builder&) = delete;

    template<class... Args>
    void emplace_back(Args&&... args)
    {
        new (data_ + size_) T{std::forward<Args>(args)...};
        ++size_;
    }

    // ——— The magic hand-off ———
    auto build() && -> static_vector<T>
    {
        return {cap_, size_, std::exchange(data_, nullptr)};
    }

private:
    std::size_t cap_, size_ = 0;
    T* data_ = nullptr;
};

Here's a concrete example of how you would use the builder pattern in practice, clearly separating the mutable building phase from the immutable, finalized state:

// Phase 1 – build
static_vector_builder<std::mutex> b(8);

for (auto i = 0u; i < 8; ++i) {
    b.emplace_back();
}

// Phase 2 – freeze
auto mutexes = std::move(b).build();
// mutexes.emplace_back(...)  ← does not compile

mutexes[3].lock();

The fill-then-freeze builder pattern turns capacity planning into a type-level guarantee: once you call build(), the container is sealed, overflow is impossible, and every reference stays valid, which is ideal for hot paths, concurrency, and code clarity. The trade-off is rigidity: late inserts now mean starting a new builder and replacing your old static_vector. If your data really is fixed after initialization, the safety and performance wins generally outweigh that modest cost.

Benchmarks

To demonstrate the effectiveness of static_vector compared to the commonly used alternative std::vector<std::unique_ptr<T>>, we performed some benchmarks focusing on three key operations:

  • Creation and Destruction: Measures the overhead of constructing and destructing the containers.
  • Iteration: Evaluates the performance impact of sequentially accessing each element.
  • Random Access: Tests accessing elements at random indices.

The benchmark implementation is openly available on our GitHub.

Benchmark Setup

The benchmarks were conducted on the following hardware and software environment:

  • AMD Ryzen 7 PRO 5850U
  • Linux Kernel 6.6.85-2
  • GCC-14.2.1 with -O3 -march=native -std=c++23
  • 100 iterations to ensure statistical reliability

Results

The plot below visually highlights the performance advantages of static_vector. The results clearly indicate that it significantly outperforms std::vector<std::unique_ptr<T>> across all tested scenarios.

Benchmark comparison chart: static_vector vs vector<unique_ptr> performance for Create+Destroy, Iterate, and Random Access on AMD Ryzen 7 PRO 5850U

In summary, the benchmarks affirm that static_vector is a superior container choice for scenarios involving non-movable and non-copyable types, achieving an overall better performance.

Conclusion

Using static_vector for non-copyable and non-movable types significantly improves memory locality, avoids costly allocations, and ensures safer semantics compared to typical solutions such as std::vector<std::unique_ptr<T>>. By choosing the appropriate overflow handling strategy, you can further tailor the container to your application’s specific needs. Ultimately, static_vector offers a powerful alternative in performance-sensitive C++ code.