I'm trying to use the excellent uproot and awkward-array to read some analysis data stored in a TTree. I understand that ROOT doesn't write nested vectors (i.e. std::vector<std::vector<int>>) in a columnar format, but following this discussion, I modified my tree output to contain two separate branches: one std::vector<int> with the content, and one std::vector<int> with the offsets. Between successive fills of the tree, values are pushed into the content vector several times. After each push, the current size of the content vector is stored in the offsets vector.
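To make the layout concrete, here is a small pure-Python sketch of what ends up in the two branches for a single tree entry. The real code fills C++ std::vector<int> branches; the function name fill_entry is hypothetical, and the leading 0 in the offsets is an assumption based on the numbers in my example further down.

```python
def fill_entry(groups):
    """Mimic filling one tree entry (hypothetical helper, not the real C++ code)."""
    content = []
    offsets = [0]  # assumption: a leading 0 is stored before any values are pushed
    for group in groups:
        content.extend(group)          # push one group of values into the content
        offsets.append(len(content))   # record the content size after the push
    return content, offsets


# Same numbers as the first entry of the example further down.
content, offsets = fill_entry([[12], [14, 3, 4]])
print(content)  # [12, 14, 3, 4]
print(offsets)  # [0, 1, 4]
```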
My idea was to recreate the structure that I need as a nested JaggedArray when I read the tree. However, after reading through the awkward-array documentation, I can't figure out the right way to construct this nested JaggedArray without looping in Python. fromoffsets requires a 1D index, which means that the jagged indices must be flattened, and flattening loses the per-entry structure. None of the other class methods seem to fit. The example below uses a generator, which I expect to be rather slow because it loops in Python. Is there a better way to construct the JaggedArray? Or a better way to store the data in the tree?
import awkward as ak
all_jagged_indices = ak.fromiter([[0, 1, 4], [0, 1, 2, 3]])
all_constituents = ak.fromiter([[12, 14, 3, 4], [2, 8, 3]])
output = ak.fromiter(
    (ak.JaggedArray.fromoffsets(jagged_indices, constituents)
     for jagged_indices, constituents in
     zip(all_jagged_indices, all_constituents))
)
expected = ak.fromiter([[[12], [14, 3, 4]], [[2], [8], [3]]])
assert (output == expected).all().all().all()
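One direction I have been considering is to do the offset arithmetic on the flattened branches with NumPy. The sketch below is only that: it uses plain NumPy arrays with the same numbers as the example above, and it assumes every entry stores at least one offset (the leading 0). I am assuming the flat arrays and per-entry counts would come from the content and counts of the arrays uproot returns, and that the resulting starts, stops, and per-entry counts could then be handed to awkward-array, but I have not verified that part.

```python
import numpy as np

# Flattened view of both branches over all entries (same numbers as above).
flat_offsets = np.array([0, 1, 4, 0, 1, 2, 3])    # per-entry offsets, concatenated
offsets_counts = np.array([3, 4])                  # number of offsets per entry
flat_content = np.array([12, 14, 3, 4, 2, 8, 3])   # per-entry contents, concatenated
content_counts = np.array([4, 3])                  # number of values per entry

# Position in flat_content where each entry's values begin.
content_starts = np.cumsum(content_counts) - content_counts

# Shift each per-entry offset so that it indexes into flat_content.
global_offsets = flat_offsets + np.repeat(content_starts, offsets_counts)

# Flag the first and last offset of every entry.
last_index = np.cumsum(offsets_counts) - 1
first_index = last_index - offsets_counts + 1
is_first = np.zeros(len(flat_offsets), dtype=bool)
is_last = np.zeros(len(flat_offsets), dtype=bool)
is_first[first_index] = True
is_last[last_index] = True

# Every offset except an entry's last one starts an inner list,
# and every offset except an entry's first one ends an inner list.
inner_starts = global_offsets[~is_last]
inner_stops = global_offsets[~is_first]
sublists_per_entry = offsets_counts - 1

# Python loop used only to check the result, not part of the technique.
rebuilt, k = [], 0
for n in sublists_per_entry:
    pairs = zip(inner_starts[k:k + n], inner_stops[k:k + n])
    rebuilt.append([flat_content[a:b].tolist() for a, b in pairs])
    k += n
print(rebuilt)  # [[[12], [14, 3, 4]], [[2], [8], [3]]]
```

If this is the right idea, the only Python-level loop left is the one used for checking, but I don't know whether awkward-array already provides this in a more direct way.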
Thanks!