I'm pivoting a rather large dataframe of shape (10_000_000, 678) into one of approximately shape (770_000, 8_789) to create a dataset for an ML algorithm. It's a relatively slow operation, taking about half an hour on the high-RAM cluster I'm using, and I'm wondering whether there is a way to speed it up. Here is a minimal example, with a larger one below:
import polars as pl
import numpy as np
data = {
    "id": [1,1,1,2,2,2,3,3,3],
    "rank": [1,2,3,1,2,3,1,2,3], # rank is always repeating 1-3 (or 0-12 in large example)
    "A": np.random.random((9)),
    "B": np.random.random((9)),
}
df = pl.DataFrame(data)
df_pivot = df.pivot(values=["A", "B"], index="id", columns="rank")
# Now rename columns, since they are currently:
# df_pivot.columns
# ['id', '1', '2', '3', '1', '2', '3']
ranks = [1,2,3]
renamed_columns = df_pivot.columns[:1]
for col in df.columns[2:]:
    for rank in ranks:
        renamed_columns.append(f"{col}_{rank}")
df_pivot.columns = renamed_columns
# df_pivot
shape: (3, 7)
┌─────┬──────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ id  ┆ A_1      ┆ A_2      ┆ A_3      ┆ B_1      ┆ B_2      ┆ B_3      │
│ --- ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ i64 ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      ┆ f64      │
╞═════╪══════════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ 1   ┆ 0.867957 ┆ 0.854234 ┆ 0.408062 ┆ 0.076254 ┆ 0.899092 ┆ 0.059019 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 2   ┆ 0.642296 ┆ 0.670476 ┆ 0.480494 ┆ 0.4254   ┆ 0.536173 ┆ 0.492312 │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 3   ┆ 0.778481 ┆ 0.151697 ┆ 0.330138 ┆ 0.6661   ┆ 0.4086   ┆ 0.992057 │
└─────┴──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘
The polars pivot code states in a comment:
Polars lazy does not implement a pivot because it is impossible to know the schema without materializing the whole dataset. This makes a pivot quite a terrible operation for performant workflows. An optimization can never be pushed down past a pivot.
And in the groupby.pivot code:
Polars'/arrow memory is not ideal for transposing operations like pivots. If you have a relatively large table, consider using a groupby over a pivot.
Some questions:
- Is it possible to replace the above pivot example with a (preferably lazy) combination of groupby and something else (see the sketch after this list)? This SO post about pandas suggests an equivalence between groupby + "unstack" and pivot. Polars does not implement an unstack function, afaik.
- Is the above suggestion more performant than the current pivot implementation? (See the larger example below). 
- I actually do know the schema ahead of time, since in my situation rank is a known series ([1, 2, 3] in the example). If it were implemented, would a lazy pivot where one can supply the schema be more performant than the eager one?
- Should I be implementing it differently? (One alternative idea is sketched after the larger example below.)
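
To make the first question concrete, here is roughly what I had in mind for the small example, written as a lazy groupby + agg that exploits the known rank values. This is only a sketch (depending on the polars version, groupby may be spelled group_by), and I haven't verified that it is actually faster than the pivot:

# Sketch: express the pivot as a lazy groupby + agg over the known rank values.
# For each value column and each rank, take the single value within the group
# where "rank" equals that rank.
value_cols = ["A", "B"]
df_alt = (
    df.lazy()
    .groupby("id", maintain_order=True)
    .agg(
        [
            pl.col(col).filter(pl.col("rank") == rank).first().alias(f"{col}_{rank}")
            for col in value_cols
            for rank in ranks
        ]
    )
    .collect()
)
# df_alt.columns == ['id', 'A_1', 'A_2', 'A_3', 'B_1', 'B_2', 'B_3']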
# Much larger example, but with N_ROWS = 10_000 instead of 10_000_000
# N_ROWS = 10_000 runs in ~3 seconds, 100_000 in ~40 seconds (M1 MacBook)
from string import ascii_lowercase
import polars as pl
import numpy as np
ranks = np.arange(13)
N_ROWS = 10_000 # this could be ~10_000_000
df = (pl.DataFrame({"ID": np.arange(N_ROWS)})).join(
    pl.DataFrame({"rank": ranks}), how="cross"
)
# create 26**2 dummy column names
column_names = []
for letter1 in ascii_lowercase:
    for letter2 in ascii_lowercase:
        column_names.append(letter1 + letter2)
# stack frames to create: ID, ranks, aa, ab, ..., zz
df = df.hstack(
    pl.DataFrame({letter: np.random.random(len(df)) for letter in column_names})
    )
df_pivot = df.pivot(values=df.columns[2:], index="ID", columns="rank")
renamed_columns = df_pivot.columns[:1]
for col in df.columns[2:]:
    for rank in ranks:
        renamed_columns.append(f"{col}_{rank}")
df_pivot.columns = renamed_columns
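
On the "should I be implementing it differently" question, one other idea I've been considering (again only a sketch, not benchmarked): since every ID has exactly len(ranks) rows, sorting by ID and rank lets each value column be reshaped into a (N_ROWS, len(ranks)) NumPy block, skipping the pivot entirely:

# Sketch: build the wide frame by reshaping, relying on every ID having exactly
# len(ranks) rows, so a sort by (ID, rank) gives fixed-size blocks per ID.
df_sorted = df.sort(["ID", "rank"])
wide = {"ID": df_sorted["ID"].to_numpy()[:: len(ranks)]}
for col in column_names:
    block = df_sorted[col].to_numpy().reshape(-1, len(ranks))
    for i, rank in enumerate(ranks):
        wide[f"{col}_{rank}"] = block[:, i]
df_alt = pl.DataFrame(wide)
# df_alt.columns should match df_pivot.columns after the rename above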
