So, I'm a bit confused. I need to be able to create many (up to a billion) small (<8 elements) fixed-size arrays at runtime. I was using std::vector, but the memory overhead was too high (24 bytes per array, before counting the elements themselves).
I decided to implement my own array class, templated on the contained type (T1) and the type used to store the array's size (T2).
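To put rough numbers on that, here is a small sketch of the arithmetic. The helper names overhead_bytes and print_overhead are made up for illustration, and the 24-byte figure is just what I see on my system; sizeof(std::vector<int>) is implementation-defined, so it may differ elsewhere.

```cpp
#include <cstddef>  // std::size_t
#include <cstdio>   // std::printf
#include <vector>

// Hypothetical helper: total bookkeeping bytes for `count` containers that
// each occupy `per_object` bytes, not counting heap storage for the elements.
constexpr unsigned long long overhead_bytes(unsigned long long count,
                                            std::size_t per_object)
{
    return count * per_object;
}

// Hypothetical helper: prints the per-object size and the total for a
// billion containers. Call it from main().
inline void print_overhead()
{
    const unsigned long long count = 1000000000ULL;  // "up to a billion" arrays
    std::printf("sizeof(std::vector<int>) = %zu\n", sizeof(std::vector<int>));
    std::printf("total overhead           = %llu bytes\n",
                overhead_bytes(count, sizeof(std::vector<int>)));
}
```

At 24 bytes per object that is 24,000,000,000 bytes of overhead for a billion arrays, versus 9,000,000,000 if each object really took 9 bytes.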
The class is called lw_vector. If I evaluate:
sizeof(lw_vector<int, unsigned char>) 
I get 16. However, as I calculate it, the size should be sizeof(int*)+sizeof(unsigned char)=9 on a 64-bit system. What am I doing wrong here? Is it possible to create an object that only requires 9 bytes of memory?
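In case it helps, here is a minimal sketch of how I'm inspecting the layout. layout_probe is a made-up stand-in with the same two data members, in the same order, as lw_vector<int, unsigned char>; padding_bytes and print_layout are hypothetical helpers too. My guess is that the compiler inserts padding so the pointer is suitably aligned, but that is an assumption on my part.

```cpp
#include <cstddef>  // offsetof, std::size_t
#include <cstdio>   // std::printf

// Hypothetical stand-in for lw_vector<int, unsigned char>: same data members,
// same declaration order, no member functions.
struct layout_probe
{
    unsigned char N;
    int* content;
};

// Bytes in the object that are not accounted for by the two members.
constexpr std::size_t padding_bytes()
{
    return sizeof(layout_probe) - (sizeof(unsigned char) + sizeof(int*));
}

// Prints size, alignment, member offset and leftover bytes.
// Call it from main().
inline void print_layout()
{
    std::printf("sizeof(layout_probe)  = %zu\n", sizeof(layout_probe));
    std::printf("alignof(layout_probe) = %zu\n", alignof(layout_probe));
    std::printf("offsetof(content)     = %zu\n", offsetof(layout_probe, content));
    std::printf("unaccounted bytes     = %zu\n", padding_bytes());
}
```

The values printed depend on the compiler and target, so I haven't listed expected output here.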
Here's the implementation of lw_vector:
#include <algorithm>  // std::copy
#include <utility>    // std::swap

template<class T1, class T2>
class lw_vector
{
public:
    T2 N;
    T1 * content;
    // Empty array: content must be null so that delete[] in the destructor is safe.
    lw_vector() : N(0), content(nullptr)
    {}
    // Allocates storage for Nt elements; explicit to avoid accidental conversions from T2.
    explicit lw_vector(T2 Nt) : N(Nt), content(new T1[Nt])
    {}
    ~lw_vector()
    {
        delete[] content;
    }
    lw_vector& operator=(lw_vector temp)
    {
        std::swap(N, temp.N);
        std::swap(content, temp.content);
        return *this;
    }
    lw_vector(const lw_vector& other)
        : N(other.N),
        content(new T1[other.N])
    {
        std::copy(other.content, other.content + N, content);
    }
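    // Unchecked element access; the const overload keeps const objects read-only.
    T1& operator[](T2 ind)
    {
        return content[ind];
    }
    const T1& operator[](T2 ind) const
    {
        return content[ind];
    }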
};
Thank you for your time!
John
 
    