I am trying bin categorical columns programtically - any idea on how I can achieve this without manually hard-coding each value in that column
Essentially, what I would like is a function whereby it counts all values up to 80% [leaves the city name as is] and replaces the remaining 20% of city names with the word 'Other'
IE: if the first 17 city names make up 80% of that column, keep the city name as is, else return 'other'.
EG:
0        Brighton
1        Yokohama
2           Levin
3       Melbourne
4     Coffeyville
5       Whakatane
6       Melbourne
7       Melbourne
8           Levin
9         Ashburn
10     Te Awamutu
11        Bishkek
12      Melbourne
13      Whanganui
14    Coffeyville
15       New York
16       Brisbane
17      Greymouth
18       Brisbane
19      Chuo City
20          Accra
21          Levin
22        Waiouru
23       Brisbane
24       New York
25      Chuo City
26        Lucerne
27      Whanganui
28    Los Angeles
29      Melbourne
df['city'].head(30).value_counts(ascending=False, normalize=True)*100
Melbourne      16.666667
Levin          10.000000
Brisbane       10.000000
Whanganui       6.666667
Coffeyville     6.666667
New York        6.666667
Chuo City       6.666667
Waiouru         3.333333
Greymouth       3.333333
Te Awamutu      3.333333
Bishkek         3.333333
Lucerne         3.333333
Ashburn         3.333333
Yokohama        3.333333
Whakatane       3.333333
Accra           3.333333
Brighton        3.333333
Los Angeles     3.333333
From Ashburn down - it should be renamed to 'other'
I have tried the below which is a start, but not exactly what I want:
city_map = dict(df['city'].value_counts(ascending=False, normalize=True)*100)
df['city_count']= df['city'].map(city_map)
def count(df):
    if df["city_count"] > 10:
        return "High"
    elif df["city_count"] < 0:
        return "Medium"
    else:
        return "Low"
df.apply(count, axis=1)
I'm not expecting any code - just some guidance on where to start or ideas on how I can achieve this