I have some data (42 features) collected from people over several months (at most 6; the number varies between entries), with every month's values in their own row. There are 9267 unique ID values (set as the index) and about 50 000 rows in the DataFrame. I want to convert this to one 42 × 6 = 252-element feature vector per ID (even though some will contain a lot of NaNs), so that I can train on them. Here is how it should look:
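To make the shape concrete, here is a toy version of the layout (hypothetical feature names and IDs; the real frame has 42 columns and up to 6 rows per ID):

import pandas as pd

# Toy long format: one row per (ID, month); 'income' and 'spend' are made-up names
long_df = pd.DataFrame(
    {'income': [1.0, 2.0, 5.0], 'spend': [3.0, 4.0, 6.0]},
    index=pd.Index([101, 101, 202], name='ID'),
)
# Desired wide format: one row per ID, later months suffixed with the month
# number, NaN where an ID has fewer than the maximum number of months:
#         income  spend  income_1  spend_1
# ID
# 101        1.0    3.0       2.0      4.0
# 202        5.0    6.0       NaN      NaN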
Here is my solution:
import pandas as pd

def flatten_features(f_matrix, ID):
    '''Constructs a 1 x (m*n) vector from an m x n matrix (m months, n features).'''
    # Check whether it is a Series (a single month), not a DataFrame
    if len(f_matrix.shape) == 1:
        f_matrix['ID'] = ID
        return f_matrix
    flattened_vector = f_matrix.iloc[0]
    for i in range(1, f_matrix.shape[0]):
        vector_append = f_matrix.iloc[i]
        # Suffix each feature name with its month number, e.g. 'income' -> 'income_1'
        vector_append.index = [f'{name}_{i}' for name in vector_append.index]
        # Series.append was removed in pandas 2.0; pd.concat is the replacement
        flattened_vector = pd.concat([flattened_vector, vector_append])
    flattened_vector['ID'] = ID
    return flattened_vector
# Construct a DataFrame of flattened vectors for the numerical features
new_indices = flatten_features(numerical_f.iloc[:6], 1).index
flattened_num_f = pd.DataFrame(columns=new_indices)
for label in numerical_f.index.unique():
    matr = numerical_f.loc[label]
    row = flatten_features(matr, label)
    # DataFrame.append was removed in pandas 2.0; concatenate the row instead
    flattened_num_f = pd.concat([flattened_num_f, row.to_frame().T])
It yields the needed results, but it runs very slowly; I suspect the repeated concatenation, which copies the accumulated data on every iteration, is to blame. Is there a more elegant and faster solution?
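For reference, this is the kind of one-shot reshape I am hoping exists (an untested sketch, assuming numerical_f is indexed by ID as above; 'month' is just a local helper name):

import pandas as pd

# Number each row within its ID group: 0, 1, 2, ... per person
month = numerical_f.groupby(level=0).cumcount()
# Move the month number into the index, then unstack it into the columns,
# producing one row per ID with a (feature, month) column MultiIndex;
# missing months become NaN automatically
wide = numerical_f.set_index(month, append=True).unstack()
# Flatten the column MultiIndex to match the 'name_month' scheme,
# leaving month 0 unsuffixed as in flatten_features
wide.columns = [f'{name}_{m}' if m else name for name, m in wide.columns]

One difference from my loop: here the ID stays in the index rather than becoming an 'ID' column.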