Let's say I have a DataFrame with a MultiIndex, constructed as follows:
import numpy as np
import pandas as pd
ids = ['a', 'b', 'c']
hours = np.arange(24)
data = np.random.random((len(ids),len(hours)))
df = pd.concat([pd.DataFrame(index = [[id]*len(hours), hours], data = {'value':data[ind]}) for ind, id in enumerate(ids)])
df.index.names = ['ID', 'hour']
Which looks like this:
            value
ID hour          
a  0     0.020479
   1     0.059987
   2     0.053100
   3     0.406198
   4     0.452231
          ...
c  19    0.150493
   20    0.617098
   21    0.377062
   22    0.196807
   23    0.954401
What I want is a new 24-hour time series for each station (ID), in which each value is a 5-hour centered rolling average.
I know I can do something like df.rolling(5, center = True, on = 'hour'), but this doesn't account for the hours being cyclical, i.e., the rolling average for hour 0 should be the average of hours 22, 23, 0, 1, and 2.
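To make the wrap-around behaviour concrete, here is a minimal sketch of the result I'm after. It assumes each ID has all 24 hours present and sorted; the helper name cyclic_rolling_mean, the 'rolling' column, and the fixed seed are just illustrative choices. It pads each ID's values with np.pad(mode='wrap') and averages with a uniform kernel, which may well not be the cleanest way to do it in pandas:

```python
import numpy as np
import pandas as pd

# Same shape of data as above; the fixed seed is only for reproducibility.
ids = ['a', 'b', 'c']
hours = np.arange(24)
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product([ids, hours], names=['ID', 'hour'])
df = pd.DataFrame({'value': rng.random(len(index))}, index=index)


def cyclic_rolling_mean(values, window=5):
    """Centered rolling mean that wraps around the ends (illustrative helper).

    Assumes `values` is a Series covering one full cycle in sorted order
    and that `window` is odd.
    """
    half = window // 2
    arr = values.to_numpy(dtype=float)
    # mode='wrap' pads each end with values taken from the opposite end,
    # so hour 0 sees hours 22 and 23 to its left.
    padded = np.pad(arr, half, mode='wrap')
    # A 'valid' convolution over the padded array returns len(arr) points.
    kernel = np.ones(window) / window
    smoothed = np.convolve(padded, kernel, mode='valid')
    return pd.Series(smoothed, index=values.index)


# Apply the helper separately to each ID so windows never mix stations.
df['rolling'] = df.groupby(level='ID')['value'].transform(cyclic_rolling_mean)
```

With this, df.loc[('a', 0), 'rolling'] is the mean of hours 22, 23, 0, 1, and 2 for ID 'a', which is the behaviour I want.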
What is a good way to do this?
Thanks!