Is it possible to make a std::vector of custom structs allocate aligned memory for further processing with SIMD instructions? If this can be done with an Allocator, does anyone happen to have such an allocator they could share?
 
    
- 
                    Did you check to see if the standard allocator already does that for you? – TemplateRex Oct 17 '12 at 20:24
- 
                    @rhalbersma: I don't think it does, it doesn't take an alignment parameter. – Violet Giraffe Oct 17 '12 at 20:47
- 
                    No, what I mean is: does your STL implementation already align memory for you? Did you compute the memory address of `v.begin()` and check whether it starts at a multiple of X bytes? Even though you can't explicitly configure alignment, the std::allocator might already help you with that. – TemplateRex Oct 18 '12 at 05:32
- 
                    @rhalbersma: I believe it aligns on 4B (32 bit) boundary, but I need 128 bit alignment. – Violet Giraffe Oct 18 '12 at 06:28
- 
                    @VioletGiraffe: more likely it aligns on an 8 byte boundary. – Mooing Duck Oct 18 '12 at 22:05
- 
                    possible duplicate of [How is a vector's data aligned?](http://stackoverflow.com/questions/8456236/how-is-a-vectors-data-aligned) – legends2k Jul 17 '15 at 13:41
- 
                    Note that with C++17, `std::vector<__m256>` automatically allocates memory with a 32 byte alignment :-) – Marc Glisse May 01 '17 at 07:39
- 
                    @MarcGlisse: good to know, thanks! – Violet Giraffe May 01 '17 at 08:33
- 
                    @MarcGlisse Can you convert your comment to an answer so I can upvote it ? – gansub Apr 15 '19 at 10:26
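The check suggested in the comments (take the address of the vector's storage and see whether it is a multiple of X bytes) can be sketched as follows. `is_aligned` is a hypothetical helper name, not something from the standard library, and the result only tells you what your implementation happened to return for that one allocation; it is not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if the address p is a multiple of `alignment` bytes.
// `alignment` must be non-zero.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For example, `is_aligned(v.data(), 16)` on a non-empty `std::vector<float> v` reports whether this particular buffer is usable for aligned SSE loads.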
4 Answers
Edit: I removed the inheritance from std::allocator as suggested by GManNickG and made the alignment parameter a compile-time thing.
I recently wrote this piece of code. It's not tested as much as I would like, so go ahead and report errors. :-)
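#include <cstddef>      // size_t, ptrdiff_t
#include <memory>       // std::addressof
#include <new>          // std::bad_alloc, placement new
#include <type_traits>  // std::true_type
#include <utility>      // std::forward
enum class Alignment : size_t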
{
    Normal = sizeof(void*),
    SSE    = 16,
    AVX    = 32,
};
namespace detail {
    void* allocate_aligned_memory(size_t align, size_t size);
    void deallocate_aligned_memory(void* ptr) noexcept;
}
template <typename T, Alignment Align = Alignment::AVX>
class AlignedAllocator;
template <Alignment Align>
class AlignedAllocator<void, Align>
{
public:
    typedef void*             pointer;
    typedef const void*       const_pointer;
    typedef void              value_type;
    template <class U> struct rebind { typedef AlignedAllocator<U, Align> other; };
};
template <typename T, Alignment Align>
class AlignedAllocator
{
public:
    typedef T         value_type;
    typedef T*        pointer;
    typedef const T*  const_pointer;
    typedef T&        reference;
    typedef const T&  const_reference;
    typedef size_t    size_type;
    typedef ptrdiff_t difference_type;
    typedef std::true_type propagate_on_container_move_assignment;
    template <class U>
    struct rebind { typedef AlignedAllocator<U, Align> other; };
public:
    AlignedAllocator() noexcept
    {}
    template <class U>
    AlignedAllocator(const AlignedAllocator<U, Align>&) noexcept
    {}
    size_type
    max_size() const noexcept
    { return (size_type(~0) - size_type(Align)) / sizeof(T); }
    pointer
    address(reference x) const noexcept
    { return std::addressof(x); }
    const_pointer
    address(const_reference x) const noexcept
    { return std::addressof(x); }
    pointer
    allocate(size_type n, typename AlignedAllocator<void, Align>::const_pointer = 0)
    {
        const size_type alignment = static_cast<size_type>( Align );
        void* ptr = detail::allocate_aligned_memory(alignment , n * sizeof(T));
        if (ptr == nullptr) {
            throw std::bad_alloc();
        }
        return reinterpret_cast<pointer>(ptr);
    }
    void
    deallocate(pointer p, size_type) noexcept
    { return detail::deallocate_aligned_memory(p); }
    template <class U, class ...Args>
    void
    construct(U* p, Args&&... args)
    { ::new(reinterpret_cast<void*>(p)) U(std::forward<Args>(args)...); }
    void
    destroy(pointer p)
    { p->~T(); }
};
template <typename T, Alignment Align>
class AlignedAllocator<const T, Align>
{
public:
    typedef T         value_type;
    typedef const T*  pointer;
    typedef const T*  const_pointer;
    typedef const T&  reference;
    typedef const T&  const_reference;
    typedef size_t    size_type;
    typedef ptrdiff_t difference_type;
    typedef std::true_type propagate_on_container_move_assignment;
    template <class U>
    struct rebind { typedef AlignedAllocator<U, Align> other; };
public:
    AlignedAllocator() noexcept
    {}
    template <class U>
    AlignedAllocator(const AlignedAllocator<U, Align>&) noexcept
    {}
    size_type
    max_size() const noexcept
    { return (size_type(~0) - size_type(Align)) / sizeof(T); }
    const_pointer
    address(const_reference x) const noexcept
    { return std::addressof(x); }
    pointer
    allocate(size_type n, typename AlignedAllocator<void, Align>::const_pointer = 0)
    {
        const size_type alignment = static_cast<size_type>( Align );
        void* ptr = detail::allocate_aligned_memory(alignment , n * sizeof(T));
        if (ptr == nullptr) {
            throw std::bad_alloc();
        }
        return reinterpret_cast<pointer>(ptr);
    }
    void
    deallocate(pointer p, size_type) noexcept
    { // pointer is const T* here, which does not convert to void* implicitly
      detail::deallocate_aligned_memory(const_cast<T*>(p)); }
    template <class U, class ...Args>
    void
    construct(U* p, Args&&... args)
    { ::new(reinterpret_cast<void*>(p)) U(std::forward<Args>(args)...); }
    void
    destroy(pointer p)
    { p->~T(); }
};
template <typename T, Alignment TAlign, typename U, Alignment UAlign>
inline
bool
operator== (const AlignedAllocator<T,TAlign>&, const AlignedAllocator<U, UAlign>&) noexcept
{ return TAlign == UAlign; }
template <typename T, Alignment TAlign, typename U, Alignment UAlign>
inline
bool
operator!= (const AlignedAllocator<T,TAlign>&, const AlignedAllocator<U, UAlign>&) noexcept
{ return TAlign != UAlign; }
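The implementation of the actual allocate calls is POSIX-only, but you can extend that easily.
#include <cassert>   // assert
#include <stdlib.h>  // posix_memalign, free
void*
detail::allocate_aligned_memory(size_t align, size_t size)
{
    assert(align >= sizeof(void*));
    // Power-of-two check. The original called nail::is_power_of_two, a helper that is not shown here.
    assert((align & (align - 1)) == 0);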
    if (size == 0) {
        return nullptr;
    }
    void* ptr = nullptr;
    int rc = posix_memalign(&ptr, align, size);
    if (rc != 0) {
        return nullptr;
    }
    return ptr;
}
void
detail::deallocate_aligned_memory(void *ptr) noexcept
{
    free(ptr);
}
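As one way the POSIX-only part might be extended, here is a sketch that switches to `_aligned_malloc` / `_aligned_free` on Windows. The function names `portable_aligned_alloc` and `portable_aligned_free` are made up for this illustration and are not part of the answer's code. Note that memory obtained from `_aligned_malloc` must be released with `_aligned_free`, not `free`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(_WIN32)
#  include <malloc.h>   // _aligned_malloc, _aligned_free
#else
#  include <stdlib.h>   // posix_memalign, free
#endif

// Hypothetical helper: returns `size` bytes aligned to `align`, or nullptr on failure.
// `align` must be a power of two and, on POSIX, a multiple of sizeof(void*).
inline void* portable_aligned_alloc(std::size_t align, std::size_t size) noexcept {
#if defined(_WIN32)
    return _aligned_malloc(size, align);   // argument order: size first, then alignment
#else
    void* ptr = nullptr;
    return posix_memalign(&ptr, align, size) == 0 ? ptr : nullptr;
#endif
}

// Hypothetical helper: releases memory obtained from portable_aligned_alloc.
inline void portable_aligned_free(void* ptr) noexcept {
#if defined(_WIN32)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```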
Needs C++11, btw.
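For comparison, here is a condensed sketch of how little a C++11 allocator strictly needs, since `std::allocator_traits` supplies defaults for `construct`, `destroy`, `max_size` and most of the typedefs. `MinimalAlignedAllocator` is a hypothetical name and this is not the code above, just an illustration under the same POSIX assumption. `rebind` has to be spelled out because `allocator_traits` cannot derive it for a template that has a non-type parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>
#include <stdlib.h>   // posix_memalign, free (POSIX only)

// Hypothetical minimal allocator; Align is the alignment in bytes.
template <typename T, std::size_t Align = 32>
struct MinimalAlignedAllocator {
    static_assert(Align >= sizeof(void*) && (Align & (Align - 1)) == 0,
                  "posix_memalign needs a power of two that is at least sizeof(void*)");
    static_assert(Align >= alignof(T), "Align must not be weaker than alignof(T)");

    typedef T value_type;

    template <class U>
    struct rebind { typedef MinimalAlignedAllocator<U, Align> other; };

    MinimalAlignedAllocator() noexcept {}
    template <class U>
    MinimalAlignedAllocator(const MinimalAlignedAllocator<U, Align>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = nullptr;
        if (posix_memalign(&p, Align, n * sizeof(T)) != 0) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { free(p); }
};

// Stateless allocators with the same alignment are interchangeable.
template <class T, class U, std::size_t A>
bool operator==(const MinimalAlignedAllocator<T, A>&,
                const MinimalAlignedAllocator<U, A>&) noexcept { return true; }
template <class T, class U, std::size_t A>
bool operator!=(const MinimalAlignedAllocator<T, A>&,
                const MinimalAlignedAllocator<U, A>&) noexcept { return false; }
```

It would be used as `std::vector<float, MinimalAlignedAllocator<float, 32>> v(100);`, after which `v.data()` should be 32-byte aligned.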
 
    
- 
                    I don't think you need to or should inherit from `std::allocator<>`. – GManNickG Oct 17 '12 at 20:30
- 
                    The old one was better, there's no way I can compile this in VS 2010 :) – Violet Giraffe Oct 20 '12 at 06:25
- 
                    I used the std::allocator as a reference but removed most of the ifdefs and stuff... Might be it only works with clang... – znkr Oct 20 '12 at 11:20
- 
                    @shoosh: That's not hard to implement. E. g. http://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2 – Violet Giraffe Apr 24 '16 at 14:04
In the upcoming version 1.56, the Boost library will include Boost.Align. Among other memory alignment helpers, it provides boost::alignment::aligned_allocator, which can be used as a drop-in replacement for std::allocator and allows you to specify an alignment. See the documentation at https://boostorg.github.io/align/
 
    
- 
                    It's good to know, but personally I find `boost` quite a pain to integrate into my projects (those libraries that are not header-only). – Violet Giraffe Jun 23 '14 at 13:13
- 
                    I agree, integrating boost can be a bit of a pain. However, `Boost.Align` _is_ header-only and also only depends on other header-only libraries AFAICS. – tklauser Jun 24 '14 at 09:59
- 
                    It is now available: http://www.boost.org/doc/libs/1_56_0/libs/core/doc/html/index.html – Moncef M. Sep 09 '14 at 13:07
Starting in C++17, just use std::vector<__m256i>, or a vector of any other over-aligned type. C++17 adds an aligned version of operator new, and std::allocator uses it for over-aligned types (as do plain new-expressions, so new __m256i[N] is also safe starting in C++17).
@MarcGlisse already said this in a comment; I'm posting it as an answer to make it more visible.
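A small sketch of what this looks like in practice, assuming the code is compiled as C++17 or later. `Vec8f` is a hypothetical over-aligned struct standing in for `__m256i`, so the example does not depend on x86 intrinsics headers; `storage_is_aligned` is likewise a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical over-aligned type: eight floats with 32-byte (AVX-sized) alignment.
struct alignas(32) Vec8f {
    float lane[8];
};

static_assert(alignof(Vec8f) == 32, "alignas(32) should raise the alignment to 32");

// Hypothetical helper: true if the vector's buffer honours alignof(Vec8f).
// In C++17 and later std::allocator allocates over-aligned types through
// the aligned operator new, so this is expected to hold for a non-empty vector.
inline bool storage_is_aligned(const std::vector<Vec8f>& v) {
    return reinterpret_cast<std::uintptr_t>(v.data()) % alignof(Vec8f) == 0;
}
```

No custom allocator appears anywhere: `std::vector<Vec8f> v(1000);` is enough, and the same holds for `new Vec8f[N]`.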
 
    
Yes, it should be possible. If you search for this question on Google, you will find lots of sample code; below are some promising results:
 
    