
I am currently refactoring an imperative C++ program that makes extensive use of AVX2 intrinsics into a well-structured, class-based program. Unfortunately, I get a segfault when assigning to a class member of an AVX2 data type.

I'm running under WSL with:

gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)

Compiling with flags:

g++ -mavx2 -g minimal.cpp

A minimal code sample to reproduce the segfault is:

#include <immintrin.h>

class MyClass
{
    public:
        MyClass(int* arr);
        __m256i value;
};

MyClass::MyClass(int* arr){
    this->value = _mm256_set_epi32(arr[0], arr[1], arr[2], arr[3], arr[4], arr[5], arr[6], arr[7]);
}

int main(){
    int arr[8] = {0x0,0x1,0x2,0x3,0x4,0x5,0x6,0x7};
    MyClass* m = new MyClass(arr);
}

GDB output:

Program received signal SIGSEGV, Segmentation fault.
0x00000000080007cf in MyClass::MyClass (this=0x8413e70, arr=0x7ffffffedd90) at minimal.cpp:11
11          this->value = _mm256_set_epi32(arr[0], arr[1], arr[2], arr[3], arr[4], arr[5], arr[6], arr[7]);

I also tried assigning to the class member after the constructor had run, with the same segfault.

Update: This is a related question, but it is not a duplicate. (The focus here is on class members; the relation to "new" only became apparent after the initial question.)

  • `new` doesn't respect alignment requirements greater than `alignof(max_align_t)`, except maybe in C++17. – Peter Cordes Apr 20 '19 at 23:47
  • Interesting. Then I understand the error, thank you. Any suggestion for a workaround? – BufferFluffer Apr 21 '19 at 00:13
  • Use `_mm_malloc` or `_aligned_malloc`, either with placement new or by replacing operator new and friends for your class. See https://stackoverflow.com/questions/32612881/why-use-mm-malloc-as-opposed-to-aligned-malloc-alligned-alloc-or-posix-mem – Mike Vine Apr 21 '19 at 00:15
  • @BufferFluffer: Did you try `gcc -std=gnu++17` (along with `-march=native -g -O3` and whatever)? – Peter Cordes Apr 21 '19 at 02:17
  • @PeterCordes I tested this (or almost this) before. Unfortunately, [it seems](https://godbolt.org/z/GzsaPi) that GCC 6.3, while supporting a C++17 switch, will not actually generate a call to an allocation function with the correct alignment (thus, being non-conforming in this regard, as doing so [would be mandated by C++17](https://timsong-cpp.github.io/cppwp/n4659/expr.new#14)). Starting with GCC 7.x, it seems to be working correctly… – Michael Kenzel Apr 21 '19 at 02:30
  • @MichaelKenzel: thanks, it wasn't 100% clear from your answer if you meant that, or if you meant that `-faligned-new` *without* `-std=gnu++17` was insufficient. I guessed the former, but yeah I guess the OP needs a newer compiler. gcc6.3 is "new enough" for most things. gcc8.2.1 on Arch Linux does use aligned new: `operator new(unsigned long, std::align_val_t)` (which I can see from a link error if I compile with `gcc` instead of `g++` so it doesn't link libstdc++.) So I know it's not just a coincidence that it happened to pick an aligned address and work. – Peter Cordes Apr 21 '19 at 02:38
  • @PeterCordes yes, I just added a link to my answer, it seems that GCC officially supports C++17 new-extended alignment starting in version 7. If possible, upgrading the compiler would certainly seem to be the most ideal solution… – Michael Kenzel Apr 21 '19 at 02:45
  • Possible duplicate of [Segmentation fault (core dumped) when using avx on an array allocated with new\[\]](https://stackoverflow.com/questions/55566275/segmentation-fault-core-dumped-when-using-avx-on-an-array-allocated-with-new) – chtz Apr 21 '19 at 16:46

1 Answer

As Peter Cordes already mentioned in the comments above, the problem is that new does not respect extended alignment before C++17. (See P0035R4, which was adopted in C++17 to make new usable for types that require more than alignof(std::max_align_t) alignment.)

GCC 7 and later support aligned new with -std=gnu++17 or -std=c++17 (or just -faligned-new). With one of those options enabled, your code will Just Work™: the compiler automatically passes the required alignment to operator new.
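To check whether a given build actually has aligned new available, something like the following sketch could help. It relies on the standard feature-test macro `__cpp_aligned_new`; the type `Vec256` and the helper names are hypothetical, and `Vec256` uses `alignas(32)` as a stand-in for a class with a `__m256i` member so that the snippet does not require `-mavx2`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a class with a __m256i member: 32-byte alignment.
struct alignas(32) Vec256 {
    int lanes[8];
};

// Reports whether the compiler advertises C++17 aligned new (P0035R4)
// through the __cpp_aligned_new feature-test macro.
constexpr bool has_aligned_new()
{
#if defined(__cpp_aligned_new)
    return true;
#else
    return false;
#endif
}

// Allocates one Vec256 with a plain new expression and reports whether
// the address it received satisfies the type's alignment requirement.
inline bool heap_object_is_aligned()
{
    std::unique_ptr<Vec256> p(new Vec256());
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p.get()) % alignof(Vec256) == 0;
}
```

When `has_aligned_new()` returns false, `heap_object_is_aligned()` may still return true by coincidence, so a true result on its own proves nothing about the compiler.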


Older GCC versions, including your 6.3, do not, so you will have to make sure manually that the memory you get is properly aligned. There are a few ways to do so.

_mm_malloc was already mentioned in the comments. On GCC, _mm_malloc seems to basically map to posix_memalign, so you could also just use that directly. A portable C++11 solution would be to allocate a buffer large enough to hold an object of your class plus whatever padding is needed at the beginning to ensure proper alignment. You could then use std::align and placement new to construct your object at a suitably aligned address.
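The std::align approach could be sketched roughly as follows. This is a minimal illustration, not a drop-in implementation: `Vec256`, `AlignedBox`, `make_aligned_vec256` and `destroy_aligned_vec256` are hypothetical names, and `Vec256` is an `alignas(32)` stand-in for a class with a `__m256i` member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical stand-in for a class with a __m256i member: 32-byte alignment.
struct alignas(32) Vec256 {
    int lanes[8];
};

// Hypothetical helper: remembers both the raw buffer and the object inside it.
struct AlignedBox {
    void* raw;    // start of the over-allocated buffer, to be passed to std::free
    Vec256* obj;  // suitably aligned object constructed inside the buffer
};

inline AlignedBox make_aligned_vec256()
{
    // Over-allocate so that an aligned address must exist inside the buffer.
    std::size_t space = sizeof(Vec256) + alignof(Vec256) - 1;
    void* raw = std::malloc(space);
    if (!raw)
        throw std::bad_alloc();

    // std::align moves p forward to the next suitably aligned address,
    // or returns nullptr if the object would no longer fit.
    void* p = raw;
    if (!std::align(alignof(Vec256), sizeof(Vec256), p, space)) {
        std::free(raw);
        throw std::bad_alloc();
    }
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(Vec256) == 0);

    // Placement new constructs the object at the aligned address.
    return AlignedBox{ raw, new (p) Vec256() };
}

inline void destroy_aligned_vec256(AlignedBox box)
{
    box.obj->~Vec256();  // placement new requires an explicit destructor call
    std::free(box.raw);  // free the original pointer, not the aligned one
}
```

The bookkeeping of two pointers is the main cost of this approach, which is one reason to hide it behind class-specific allocation functions.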

All that being said, no matter which method of allocating properly aligned memory you choose, I would strongly suggest encapsulating it by providing allocation and deallocation functions for your class. The alignment requirement is a property of the type itself. Users of your class should not have to know that, because of an implementation detail such as a member of type __m256i, every object of type MyClass has an extended alignment requirement that must be taken into account whenever such an object is allocated via a new expression. You should either disallow creating objects of this type via a new expression, or provide the necessary facilities to make the type work correctly with new expressions…
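The first option, disallowing new expressions for the type, could be sketched by deleting the class-specific allocation functions. `StackOnlyVec256` and `sum_on_stack` are hypothetical names, and the `alignas(32)` struct again stands in for a class with a `__m256i` member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a class with a __m256i member: 32-byte alignment.
struct alignas(32) StackOnlyVec256 {
    int lanes[8];

    // Deleting the allocation functions turns `new StackOnlyVec256`
    // and `new StackOnlyVec256[n]` into compile errors.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
};

// Objects with automatic storage are still fine: the compiler aligns them itself.
inline int sum_on_stack()
{
    StackOnlyVec256 v = {{1, 2, 3, 4, 5, 6, 7, 8}};
    assert(reinterpret_cast<std::uintptr_t>(&v) % alignof(StackOnlyVec256) == 0);

    int total = 0;
    for (int lane : v.lanes)
        total += lane;
    return total;
}
```

Note that this only blocks ordinary new expressions. A caller who writes `::new StackOnlyVec256` explicitly selects the global allocation function and bypasses the deleted members.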

A C++11 solution that provides such allocation and deallocation functions could look like this:

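#include <cstddef>
#include <memory>
#include <new>

#include <immintrin.h>

class MyClass
{
    __m256i value;

public:
    MyClass(const int* arr)
    {
        this->value = _mm256_set_epi32(arr[0], arr[1], arr[2], arr[3], arr[4], arr[5], arr[6], arr[7]);
    }

    // Class-specific allocation functions: every new expression for MyClass
    // gets storage aligned to alignof(MyClass).
    void* operator new(std::size_t size)
    {
        void* ptr = _mm_malloc(size, alignof(MyClass));
        if (!ptr)
            throw std::bad_alloc();  // a throwing operator new must not return null
        return ptr;
    }

    void* operator new[](std::size_t size)
    {
        void* ptr = _mm_malloc(size, alignof(MyClass));
        if (!ptr)
            throw std::bad_alloc();
        return ptr;
    }

    // Memory obtained from _mm_malloc must be released with _mm_free.
    void operator delete(void* ptr)
    {
        _mm_free(ptr);
    }

    void operator delete[](void* ptr)
    {
        _mm_free(ptr);
    }
};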

int main()
{
    int arr[8] = {0x0,0x1,0x2,0x3,0x4,0x5,0x6,0x7};
    auto m = std::unique_ptr<MyClass> { new MyClass(arr) };
}


Michael Kenzel
  • I made an edit which IMO improves the skimmability of the early part of the answer and simplifies the presentation of your separate points, especially re: GCC versions. Obviously feel free to roll back or re-edit if you liked your version better. – Peter Cordes Apr 21 '19 at 03:22