I've put this question in quite a bit of context, to hopefully make it easier to understand, but feel free to skip down to the actual question.
Context
Here is the work I was doing which sparked this question:
I'm working with an API to access some tabular data, which is effectively a labelled N-dimensional array. The data is returned as a flattened list of lists (of the actual data values), plus a list of the different axes and their labels, e.g.:
raw_data = [
    ['nrm', 'nrf'],
    ['ngm', 'ngf'],
    ['nbm', 'nbf'],
    ['srm', 'srf'],
    ['sgm', 'sgf'],
    ['sbm', 'sbf'],
    ['erm', 'erf'],
    ['egm', 'egf'],
    ['ebm', 'ebf'],
    ['wrm', 'wrf'],
    ['wgm', 'wgf'],
    ['wbm', 'wbf'],
]
axes = [
    ('Gender', ['Male', 'Female']),
    ('Color', ['Red', 'Green', 'Blue']),
    ('Location', ['North', 'South', 'East', 'West']),
]
The data is normally numeric, but I've used strings here so you can easily see how it matches up with the labels, e.g. nrm is the value for North, Red, Male.
The data loops through axis 0 as you go across (within) a list, and then loops through axes 1 and 2 as you go down the lists, with axis 1 (on the "inside") varying most rapidly, then 2 (and for higher-dimensional data continuing to work "outwards"), viz:
       axis 0 ->
a a [ # # # # # # ]
x x [ # # # # # # ]
i i [ # # # # # # ]
s s [ #  R A W  # ]
    [ # D A T A # ]
2 1 [ # # # # # # ]
↓ ↓ [ # # # # # # ]
    [ # # # # # # ]
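To make the row ordering concrete, here is a minimal sketch of how a row number maps back to positions on axes 1 and 2. It assumes the sizes from my example, and it uses np.unravel_index with order="F" as my own way of expressing "axis 1 varies fastest"; the API itself does not document it that way.

```python
import numpy as np

# Axis sizes from the example: Gender=2, Color=3, Location=4.
sizes = (2, 3, 4)
n_rows = sizes[1] * sizes[2]  # 12 rows in raw_data

# Each row number r encodes (color, location) with color varying fastest,
# i.e. r = color + 3 * location. That is a column-major ("F") unravelling.
for r in range(n_rows):
    color, location = np.unravel_index(r, sizes[1:], order="F")
    assert r == color + sizes[1] * location

# Row 4 (0-based) is ['sgm', 'sgf']: Green (index 1), South (index 1).
row4 = tuple(int(i) for i in np.unravel_index(4, sizes[1:], order="F"))
print(row4)  # (1, 1)
```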
I want to reshape this data and match it up with its labels, which I did using the following to output it into a Pandas (multi-index) DataFrame:
import numpy as np
import pandas as pd
names = [name for (name, _) in axes]
labels = [labels for (_, labels) in axes]
sizes = tuple(len(L) for L in labels)  # (2, 3, 4)
data_as_array = np.array(raw_data)  # shape = (12, 2) = (3*4, 2)
A = len(sizes)  # number of axes
new_shape = (*sizes[1:], sizes[0])  # (3, 4, 2)
data = data_as_array.reshape(new_shape, order="F").transpose(A - 1, *range(A - 1))
# With my numbers: data_as_array.reshape((3, 4, 2), order="F").transpose(2, 0, 1)
df = pd.DataFrame(
    data.ravel(),
    index=pd.MultiIndex.from_product(labels, names=names),
    columns=["Value"],
)
(I've noted with comments what some of the particular values are for my example, but the code is meant to be generalised for any N-dimensional data.)
This gives:
                      Value
Gender Color Location      
Male   Red   North      nrm
             South      srm
             East       erm
             West       wrm
       Green North      ngm
             South      sgm
             East       egm
             West       wgm
       Blue  North      nbm
             South      sbm
             East       ebm
             West       wbm
Female Red   North      nrf
             South      srf
             East       erf
             West       wrf
       Green North      ngf
             South      sgf
             East       egf
             West       wgf
       Blue  North      nbf
             South      sbf
             East       ebf
             West       wbf
This is all as desired and expected, and you can see that the values have ended up in the correct places, i.e. attached to their matching labels.
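Rather than checking by eye, the match between values and labels can also be tested programmatically. This is a sketch that only works because my example values are built from the label initials; it rebuilds raw_data with a comprehension (an assumption made to keep the block self-contained) and repeats the reshaping steps from above.

```python
import numpy as np
import pandas as pd

axes = [
    ("Gender", ["Male", "Female"]),
    ("Color", ["Red", "Green", "Blue"]),
    ("Location", ["North", "South", "East", "West"]),
]
# Rebuild the example raw_data: rows loop over Color fastest, then Location.
raw_data = [[loc + col + gen for gen in "mf"] for loc in "nsew" for col in "rgb"]

names = [name for (name, _) in axes]
labels = [axis_labels for (_, axis_labels) in axes]
sizes = tuple(len(L) for L in labels)  # (2, 3, 4)
A = len(sizes)
data = (
    np.array(raw_data)
    .reshape((*sizes[1:], sizes[0]), order="F")
    .transpose(A - 1, *range(A - 1))
)
df = pd.DataFrame(
    data.ravel(),
    index=pd.MultiIndex.from_product(labels, names=names),
    columns=["Value"],
)

# Each value should spell out the initials of (Location, Color, Gender).
mismatches = [
    (gender, color, location, value)
    for (gender, color, location), value in df["Value"].items()
    if value != (location[0] + color[0] + gender[0]).lower()
]
print(mismatches)  # []
```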
Question
My actual question concerns this line:
data = data_as_array.reshape(new_shape, order="F").transpose(A - 1, *range(A - 1))
which with the specific numbers in my example was:
data = data_as_array.reshape((3, 4, 2), order="F").transpose(2, 0, 1)
After some experimentation, I discovered that all three of the following are equivalent (the first is the original version):
data1 = data_as_array.reshape(new_shape, order="F").transpose(A - 1, *range(A - 1))
data2 = data_as_array.T.reshape(*reversed(new_shape)).T.transpose(A - 1, *range(A - 1))
data3 = data_as_array.reshape(*reversed(sizes)).T
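For reference, this is roughly how I checked the equivalence. It is a sketch that uses np.arange integers as a stand-in for the real data (an assumption, so that every element is distinct), with the sizes from my example.

```python
import numpy as np

sizes = (2, 3, 4)                   # (Gender, Color, Location)
A = len(sizes)                      # number of axes
new_shape = (*sizes[1:], sizes[0])  # (3, 4, 2)
# Stand-in for raw_data: 12 rows of 2 distinct integers each.
data_as_array = np.arange(24).reshape(12, 2)

data1 = data_as_array.reshape(new_shape, order="F").transpose(A - 1, *range(A - 1))
data2 = data_as_array.T.reshape(*reversed(new_shape)).T.transpose(A - 1, *range(A - 1))
data3 = data_as_array.reshape(*reversed(sizes)).T

all_equal = np.array_equal(data1, data2) and np.array_equal(data1, data3)
print(data1.shape, all_equal)  # (2, 3, 4) True
```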
But this got me thinking (and here is my question at last!):
Are there any rules that I could use to manipulate the expression, to get from e.g. data1 to data3?
In particular, it seems like transpose() and reshape() are closely linked, and that there might be a way to "absorb" the action of the transpose() into the reshape(), so that you can drop it or at least transform it into a neater .T (as per data3).
My attempt
I managed to establish the following rule:
a.reshape(shape, order="F") == a.T.reshape(*reversed(shape)).T
You can apply .T to both sides, or substitute a.T in for a to get these variations of it:
a.reshape(shape) == a.T.reshape(*reversed(shape), order="F").T
a.reshape(shape).T == a.T.reshape(*reversed(shape), order="F")
a.T.reshape(shape) == a.reshape(*reversed(shape), order="F").T
a.reshape(shape, order="F") == a.T.reshape(*reversed(shape)).T
a.reshape(shape, order="F").T == a.T.reshape(*reversed(shape))
a.T.reshape(shape, order="F") == a.reshape(*reversed(shape)).T
I think this is effectively the definition of the difference between row-major and column-major ordering, and how they relate.
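Here is a small sketch that tests the first rule numerically, plus the flattening special case that makes me think it is really just the row-major/column-major relationship. The array contents and the target shape are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=(2, 3, 4))
shape = (6, 4)  # any shape with the same number of elements (24)

lhs = a.reshape(shape, order="F")
rhs = a.T.reshape(*reversed(shape)).T
rule_holds = np.array_equal(lhs, rhs)

# Flattening is the special case of a 1-D target shape:
# reading a in column-major order is reading a.T in row-major order.
flat_ok = np.array_equal(a.ravel(order="F"), a.T.ravel())
print(rule_holds, flat_ok)  # True True
```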
But what I haven't managed to do is show how you can go from:
data = data_as_array.reshape((3, 4, 2), order="F").transpose(2, 0, 1)
to:
data = data_as_array.reshape((4, 3, 2)).T
So somehow put the transposition into the reshape.
But I'm not even sure if this is generally true, or is specific to my data or e.g. 3 dimensions.
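As an empirical check (not a proof), the sketch below compares the two forms for a few different numbers of axes. The helper both_forms and the np.arange stand-in data are hypothetical, introduced only for this test; the raw array is laid out the same way as in my example, with axis 0 across each row.

```python
import numpy as np

def both_forms(sizes):
    """Build a stand-in raw array for the given axis sizes and reshape it both ways."""
    A = len(sizes)
    n_rows = int(np.prod(sizes[1:]))
    raw = np.arange(n_rows * sizes[0]).reshape(n_rows, sizes[0])
    new_shape = (*sizes[1:], sizes[0])
    via_transpose = raw.reshape(new_shape, order="F").transpose(A - 1, *range(A - 1))
    via_T = raw.reshape(*reversed(sizes)).T
    return via_transpose, via_T

results = {}
for sizes in [(2, 3), (2, 3, 4), (5, 2, 3, 4), (2, 2, 3, 2, 4)]:
    x, y = both_forms(sizes)
    results[sizes] = x.shape == sizes and np.array_equal(x, y)
print(results)
```

In these cases the two forms agree, which suggests it is not specific to 3 dimensions, but it doesn't tell me which rule gets you from one expression to the other.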
EDIT: 
To clarify, I'm reasonably happy with how a straight-up .T transpose works, and the rules above cover that. (Note that .T is equivalent to .transpose(2, 1, 0) for 3 axes, or .transpose(n-1, n-2, ... 2, 1, 0) for the general case of n axes.)
It's the case of using .transpose() to do a "partial" transpose that I'm curious about, e.g. .transpose(1, 0, 2), where you're doing something other than just reversing the order of the axes.
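To illustrate the distinction, here is a short sketch on an arbitrary 3-axis array: the full reversal matches .T, while the partial transpose only swaps the first two axes.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# .T reverses all axes, so for 3 axes it matches transpose(2, 1, 0).
full_reverse = np.array_equal(a.T, a.transpose(2, 1, 0))
general = np.array_equal(a.T, a.transpose(*reversed(range(a.ndim))))

# A "partial" transpose permutes only some axes:
# the element at a[i, j, k] ends up at p[j, i, k].
p = a.transpose(1, 0, 2)
same_element = bool(p[2, 1, 3] == a[1, 2, 3])
print(full_reverse, general, p.shape, same_element)  # True True (3, 2, 4) True
```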
Some references:
- This covers row-major and column-major differences: How do you unroll a Numpy array of (mxn) dimentions into a single vector (and I can quite easily see how that's happening in my data)
- This SO answer is really helpful in explaining transposing, and basically covers reshaping as well: https://stackoverflow.com/a/32034565/9219425 (check out the fantastic diagrams!), including how transposing affects shape and strides. I wrote an algorithm mimicking this process to see if that would make things clearer (e.g. transposing might correspond to swapping the order of the for loops in the algorithm), but it didn't really help.
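This is not the algorithm I mentioned, just a minimal illustration of the shape-and-strides point from that answer. It assumes an int64 array (8-byte items) in C order, so the stride values are specific to that choice.

```python
import numpy as np

a = np.arange(24, dtype=np.int64).reshape(2, 3, 4)  # 8-byte items, C order

print(a.strides)                     # (96, 32, 8)
print(a.T.strides)                   # (8, 32, 96)
print(a.transpose(1, 0, 2).strides)  # (32, 96, 8)

# Transposing is a view with permuted strides; no data is copied.
view_shares = np.shares_memory(a, a.transpose(1, 0, 2))
# Merging the two swapped axes cannot be expressed with strides alone,
# so this reshape has to copy the data.
reshaped_shares = np.shares_memory(a, a.transpose(1, 0, 2).reshape(6, 4))
print(view_shares, reshaped_shares)  # True False
```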
 
    