
I am using Python and pandas for my assignment. My DataFrame looks something like this:

Date Time Business hours
2021/4/26 0800 NO
2021/4/26 0900 Yes
2021/4/26 1000 Yes

I want to figure out whether a date was a holiday by counting how many Yes and No values there are on that date. If the count of Yes is fewer than 7, I would deem that date a holiday and exclude it from my calculations by deleting it.

I was thinking of adding a holiday column with a boolean value. I have been looking all over for a solution online but keep falling short. I'm pretty new to Python, so I apologise if I said anything stupid.

elzzup

2 Answers


We can use transform here with groupby:

s = df["Business hours"].eq("Yes").groupby(df["Date"]).transform("sum")
df[s >= 7]
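
A minimal self-contained sketch of the same idea, in case it helps to see it run end to end. The sample rows are made up for illustration (one date with 7 "Yes" rows, one with only 2), and the names `yes_per_date` and `result` are just placeholders:

```python
import pandas as pd

# Hypothetical sample data: 2021/4/26 has 7 "Yes" rows, 2021/4/27 has only 2.
df = pd.DataFrame({
    "Date": ["2021/4/26"] * 8 + ["2021/4/27"] * 3,
    "Time": ["0800", "0900", "1000", "1100", "1200", "1300", "1400", "1500",
             "0800", "0900", "1000"],
    "Business hours": ["NO"] + ["Yes"] * 7 + ["NO", "Yes", "Yes"],
})

# For every row, the number of "Yes" values on that row's date.
yes_per_date = df["Business hours"].eq("Yes").groupby(df["Date"]).transform("sum")

# Keep only rows whose date has at least 7 "Yes" values.
result = df[yes_per_date >= 7]
```

Because `transform` returns a Series aligned with the original rows, it can be used directly as a boolean mask without merging anything back.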
Erfan
  • Thank you for the help! I'm just having an issue with the code. I still want to keep the dates and times where business hours == no. I just want to remove the business hours on days that are essentially holidays. Could you point me in the direction of getting that to work? – elzzup May 06 '21 at 21:12
  • Have a look [here](https://stackoverflow.com/questions/29688899/pandas-checking-if-a-date-is-a-holiday-and-assigning-boolean-value) – Erfan May 06 '21 at 21:14
  • My holidays are not calendar ones; they depend on how many 'yes' values there are in a day – elzzup May 07 '21 at 17:47
  • I figured it out. Thought I would share how I did it in case anyone else needs it. I made a condition in a variable 'b = df[(s < 14) & (df['Bussiness Hours'] == 'Yes')].index' then just dropped the rows with 'df.drop(b, inplace=True )' – elzzup May 07 '21 at 18:47
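
The approach described in the last comment (keep every "NO" row, drop only the "Yes" rows on days that count as holidays) could be sketched roughly as below. This is an illustration under assumptions, not the commenter's exact code: the sample data is invented, the column is spelled "Business hours" as in the question, the threshold `MIN_YES` is set to 7 as in the question (the comment used 14), and the `Holiday` column is the boolean flag the question mentions.

```python
import pandas as pd

MIN_YES = 7  # assumed threshold from the question; the comment above used 14

# Hypothetical sample data: 2021/4/26 has 7 "Yes" rows, 2021/4/27 has only 2.
df = pd.DataFrame({
    "Date": ["2021/4/26"] * 8 + ["2021/4/27"] * 3,
    "Business hours": ["NO"] + ["Yes"] * 7 + ["NO", "Yes", "Yes"],
})

is_yes = df["Business hours"].eq("Yes")
yes_per_date = is_yes.groupby(df["Date"]).transform("sum")

# Boolean flag: True on every row of a date with too few "Yes" rows.
df["Holiday"] = yes_per_date < MIN_YES

# Drop only the "Yes" rows on holiday dates; all "NO" rows are kept.
cleaned = df[~(df["Holiday"] & is_yes)]
```

Building a new frame with a mask avoids `inplace=True`, but `df.drop(...)` on the matching index would give the same rows.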

Try the groupby filter function:

def filter_rows(x):
    try:
        # Keep this date only if it has at least 7 "Yes" rows
        return x['Business hours'].value_counts()['Yes'] >= 7
    except KeyError:
        # No "Yes" rows at all on this date
        return False

df = df.groupby('Date').filter(filter_rows)
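
A variant of the filter approach that avoids the `try`/`except`, shown here as a sketch on made-up data. Summing a boolean mask gives 0 for a date with no "Yes" rows, so there is no `KeyError` to catch. The helper name `has_enough_yes` and its `min_yes` parameter are hypothetical:

```python
import pandas as pd

# Hypothetical sample data: 7 "Yes" rows, then a date with none, then a date with 1.
df = pd.DataFrame({
    "Date": ["2021/4/26"] * 7 + ["2021/4/27"] * 2 + ["2021/4/28"] * 2,
    "Business hours": ["Yes"] * 7 + ["NO", "NO"] + ["Yes", "NO"],
})

def has_enough_yes(group, min_yes=7):
    # Count the "Yes" rows in this date's group; the count is 0 if there are none.
    return group["Business hours"].eq("Yes").sum() >= min_yes

# filter() keeps all rows of every group for which the function returns True.
kept = df.groupby("Date").filter(has_enough_yes)
```

Note that `filter` removes whole dates, "NO" rows included, so it matches the original question rather than the follow-up in the comments.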
Nk03