I'm trying to build regular (30-minute) time series in pandas using forward fill (ffill), but I'm getting a non-unique index error.
Here's my code:
for d in data_types:
    series = df[df['datatype'] == d]['measurementvalue'].values
    times = df[df['datatype'] == d]['displaydate'].values
    data_series = pd.Series(series, index = times)
    data_series.drop_duplicates(inplace = True)
    data_series = data_series.asfreq('30Min', method = 'ffill')  # asfreq returns a new Series
    all_series.append(data_series)
I'm getting the following error as a result of the asfreq call for one particular data_type:
ValueError: cannot reindex a non-unique index with a method or limit
This happens for a data set where drop_duplicates shortens the series from 2119 to 1299 entries, which suggests it's the most densely sampled (time-wise) data type.
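For reference, here is a minimal sketch of how the two kinds of duplicates can be told apart. The toy timestamps and values below are made up for illustration and are not from my real data. As far as I can tell, Series.drop_duplicates compares values, whereas the error message is about repeated index labels:

```python
import pandas as pd

# Toy data (hypothetical): two rows share a timestamp, two rows share a value.
idx = pd.to_datetime([
    "2016-03-02 04:03:29.693",
    "2016-03-02 04:03:29.693",   # same timestamp as the row above
    "2016-03-02 05:00:00.000",
    "2016-03-02 06:00:00.000",
])
toy = pd.Series([8.25, 7.48, 1.0, 1.0], index=idx)

# drop_duplicates() removes repeated *values*; repeated index labels survive.
by_value = toy.drop_duplicates()
value_dedup_unique = by_value.index.is_unique   # False for this toy data

# Index.duplicated() flags repeated *index labels* instead.
by_index = toy[~toy.index.duplicated(keep="first")]
index_dedup_unique = by_index.index.is_unique   # True for this toy data
```

If that reading is right, the length drop from drop_duplicates would only reflect repeated measurement values, not repeated timestamps.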
==========
EDIT
I did some poking around and have narrowed down the issue. By taking the time lags between consecutive index values, to the nearest second, I can see the 'duplicate' indices that are created when two rows fall into the same second. My guess is that these are the offending rows:
2016-03-02 04:03:29.693    8.250347
2016-03-02 04:03:29.693    7.478983
2016-03-06 00:19:30.183    45.97248
2016-03-06 00:19:30.183    24.06088
2016-03-14 02:44:58.783    9.169300
2016-03-14 02:44:58.783    4.221998
2016-03-18 21:54:20.097    73.80586
2016-03-24 16:41:19.825    3.608202
2016-03-24 16:41:19.825    3.887996
2016-03-25 03:35:57.197    4.974968
2016-03-25 03:35:57.197    5.638140
2016-04-02 11:18:27.290    7.923712
2016-04-02 11:18:27.290    6.143240
2016-04-10 19:59:54.677     3.143636
2016-04-10 19:59:54.686    14.222390
What's the best way to drop one of the values? Say I want to write a custom function that receives all the duplicate values for a given index value and returns the single value that should be used for that index value. How can I do that?
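To make the idea concrete, here is a rough sketch of the kind of thing I have in mind, on made-up toy data. The function name pick_value and the choice of the mean are placeholders and assumptions on my part, not a known-correct solution. It groups rows by index label and hands each group to the custom function:

```python
import pandas as pd

def pick_value(values):
    """Hypothetical reducer: receives every value sharing one timestamp
    (as a Series) and returns the single value to keep (here, the mean)."""
    return values.mean()

# Toy data (illustrative only): the first two rows share a timestamp.
idx = pd.to_datetime([
    "2016-03-02 04:03:29.693",
    "2016-03-02 04:03:29.693",
    "2016-03-02 05:10:00.000",
])
toy = pd.Series([8.0, 6.0, 3.0], index=idx)

# Group rows by index label; each group is passed to pick_value.
deduped = toy.groupby(level=0).agg(pick_value)

# With a unique index, asfreq with a fill method no longer raises.
regular = deduped.asfreq("30min", method="ffill")
```

Swapping the body of pick_value for something like values.iloc[-1] or values.max() would change which value wins, assuming the rest of the approach holds up.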