Say I need to have data stored as follows:
[[[{}][{}]]]
or a list of lists of two lists of dictionaries
where:
{}: dictionaries containing data from individual frames observing an event. (There are two observers/stations, hence two dictionaries.)
[{}][{}]: two lists of all the individual frames related to a single event, one from each observer/station.
[[{}][{}]]: list of all events on a single night of observation.
[[[{}][{}]]]: list of all nights.
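To make the nesting concrete, here is a tiny made-up sample with two nights. The keys `frame` and `mag` are placeholders I invented for illustration; the real dictionaries hold different fields.

```python
# Hypothetical sample data: the keys "frame" and "mag" are made up for illustration.
data = [
    [  # night 0
        [  # event 0
            [{"frame": 0, "mag": 1.2}, {"frame": 1, "mag": 1.1}],  # station 1 frames
            [{"frame": 0, "mag": 1.3}],                            # station 2 frames
        ],
        [  # event 1
            [{"frame": 0, "mag": 2.0}],
            [{"frame": 0, "mag": 2.1}, {"frame": 1, "mag": 2.2}],
        ],
    ],
    [  # night 1
        [  # event 0
            [{"frame": 0, "mag": 0.9}],
            [{"frame": 0, "mag": 1.0}],
        ],
    ],
]

# Indexing order is data[night][event][station][frame], which yields one dict.
first_frame = data[0][0][0][0]
print(first_frame)
```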
Hopefully that's clear. What I want to do is create two pandas dataframes where all dictionaries from station_1 are stored in one, and all dictionaries from station_2 are stored in the other.
My current method is as follows (where data is the above data structure):
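    import pandas as pd

    all_station_1 = []
    all_station_2 = []
    for night in data:
        for event in night:
            # each event holds two lists of frame dicts, one per station
            all_station_1.append(pd.DataFrame(event[0]))
            all_station_2.append(pd.DataFrame(event[1]))
    # combine the per-event DataFrames into one DataFrame per station
    all_station_1 = pd.concat(all_station_1)
    all_station_2 = pd.concat(all_station_2)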
My understanding, though, is that the for loop must be horribly inefficient. Since I will be scaling this script way up from my sample dataset, the cost could easily become unmanageable.
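For illustration, one alternative I can imagine is flattening all the dictionaries for a station into a single list and building the DataFrame once, rather than building many small DataFrames and concatenating them. This is only a sketch under my own assumptions: the sample data, the keys `frame` and `mag`, and the helper name `frames_for_station` are all made up, and it still iterates in Python, so I don't know whether it actually scales better.

```python
import pandas as pd

# Hypothetical sample data: data[night][event][station][frame] -> dict.
# The keys "frame" and "mag" are invented for illustration.
data = [
    [
        [[{"frame": 0, "mag": 1.2}, {"frame": 1, "mag": 1.1}], [{"frame": 0, "mag": 1.3}]],
        [[{"frame": 0, "mag": 2.0}], [{"frame": 0, "mag": 2.1}, {"frame": 1, "mag": 2.2}]],
    ],
    [
        [[{"frame": 0, "mag": 0.9}], [{"frame": 0, "mag": 1.0}]],
    ],
]


def frames_for_station(data, station):
    """Collect every frame dict for one station (index 0 or 1) across all nights and events."""
    return [
        frame
        for night in data
        for event in night
        for frame in event[station]
    ]


# Build each DataFrame once from a flat list of dicts (no intermediate DataFrames, no concat).
all_station_1 = pd.DataFrame(frames_for_station(data, 0))
all_station_2 = pd.DataFrame(frames_for_station(data, 1))
print(len(all_station_1), len(all_station_2))
```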
So, any advice for a smarter way of proceeding would be appreciated! I feel like pandas is so user friendly there's gotta be an efficient way of dealing with any kind of data structure but I haven't been able to find it on my own yet. Thanks!
