I have 2 pandas dataframes, df_pe and df_merged. Both the dataframes have several rows, as well as several columns. Now, there are some specific things I would like to accomplish using these dataframes:
- In
df_merged, there is a column namedST, which contains timestamps of various events in format eg.(2017-08-27 00:00:00). Indf_pe, there are 2 columnsTonandToffwhich contain the time when an event started and when and event ended. Eg. (Tonvalue for a random row:2018-08-17 01:20:00whileToffvalue2018-08-17 02:30:00). Secondly, there is a column in
df_pe, namelyEC. I have another dataframe calleddf_uniqueal, which also has a column calledEC. What I would like to do is:a. For all rows in
df_merged, whenever the ST value is within the duration ofTonandToffin thedf_pe, create 2 new columns indf_merged:ECandED. Append/Put the value of theECfrom data framedf_peinto this new column, while, put the value of the dataframedf_uniquealinto the new columnED(which is eventually a mapped version of theECindf_pefor obtainingEDindf_uniqueal). If none of the conditions matches/there areNaNs (missing values) left after this procedure, put the string"NF"intodf_merged's newEDcolumn, while put the integer0into thedf_merged's newECcolumn.
I have explored SO and SE, but have not found anything substantial. Any help in this regard is highly appreciated.
This is my attempt at using for loops in Python for iterating over the dataframes for accomplishing the first condition but it runs forever (never ending) and I don't think this is the best possible way to accomplish this.
for i in range(len(df_merged)):
for j in range(len(df_pe)):
if df_pe.TOn[j] < df_merged.ST[i] < df_pe.TOff[j]:
df_merged.EC[i] = df_pe.EC[j]
df_merged.ED[i] = df_uniqueal.ED[df_processed.EC[j]]
else:
df_merged.EC[i] = 0
df_merged.ED[i] = "NF"
EDIT
Please refer image for expected output and baby example of dataframes.
The relevant columns are in bold (note the column numbers may differ, but the column names are same in this sample example).