I have a dataframe as shown below:
import numpy as np
import pandas as pd

col1 = ['a','b','c','a','c','a','b','c','a']
col2 = [1,1,0,1,1,0,1,1,0]
df2 = pd.DataFrame(zip(col1,col2),columns=['name','count'])
    name    count
0   a       1
1   b       1
2   c       0
3   a       1
4   c       1
5   a       0
6   b       1
7   c       1
8   a       0
For each value in the "name" column, I am trying to find the ratio of the number of zeros to the total number of zeros and ones. First, I aggregated the counts and computed the ratios as follows:
for j in df2.name.unique():
    print(j)
    zero_ct = zero_one_frequencies[zero_one_frequencies['name'] == j][0]
    full_ct = zero_one_frequencies[zero_one_frequencies['name'] == j][0] + zero_one_frequencies[zero_one_frequencies['name'] == j][1]
    zero_pb = zero_ct / full_ct
    one_pb = 1 - zero_pb
    print(f"ZERO ratios for {j} = {zero_pb}")
    print(f"One ratios for {j} = {one_pb}")
    print("="*30)
And the output looks like:
a
ZERO ratios for a = 0    0.5
dtype: float64
One ratios for a = 0    0.5
dtype: float64
==============================
b
ZERO ratios for b = 1    0.0
dtype: float64
One ratios for b = 1    1.0
dtype: float64
==============================
c
ZERO ratios for c = 2    0.333333
dtype: float64
One ratios for c = 2    0.666667
dtype: float64
==============================
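The loop above reads from `zero_one_frequencies`, whose construction is not shown. Here is a minimal sketch of one way such a table could be built, assuming it holds one row per name with the counts of zeros and ones in columns labelled `0` and `1`. Using `pd.crosstab` is an assumption here, not necessarily how the table was originally made:

```python
import pandas as pd

col1 = ['a', 'b', 'c', 'a', 'c', 'a', 'b', 'c', 'a']
col2 = [1, 1, 0, 1, 1, 0, 1, 1, 0]
df2 = pd.DataFrame(zip(col1, col2), columns=['name', 'count'])

# Assumed shape: one row per name, integer columns 0 and 1 holding the counts.
# crosstab counts how often each (name, count) pair occurs.
zero_one_frequencies = pd.crosstab(df2['name'], df2['count']).reset_index()
zero_one_frequencies.columns.name = None  # drop the leftover 'count' axis label

print(zero_one_frequencies)
```

With a table of this shape, `zero_one_frequencies[zero_one_frequencies['name'] == j][0]` returns a one-element Series, which is why each printed ratio carries an index and a `dtype` line.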
My goal is to add two new columns to the dataframe, "name_0" and "name_1", holding the ratio values for each element in the "name" column. I tried the following, but it is not giving the expected results:
for j in df2.name.unique():
    print(j)
    zero_ct = zero_one_frequencies[zero_one_frequencies['name'] == j][0]
    full_ct = zero_one_frequencies[zero_one_frequencies['name'] == j][0] + zero_one_frequencies[zero_one_frequencies['name'] == j][1]
    zero_pb = zero_ct / full_ct
    one_pb = 1 - zero_pb
    print(f"ZERO probability for {j} = {zero_pb}")
    print(f"One probability for {j} = {one_pb}")
    print("="*30)
    
    condition1 = [ df2['name'].eq(j) & df2['count'].eq(0)]
    condition2 = [ df2['name'].eq(j) & df2['count'].eq(1)]
    choice1 = zero_pb.tolist()
    choice2 = one_pb.tolist()
    print(f'choice1 = {choice1}, choice2 = {choice2}')
    df2["name"+str("_0")] = np.select(condition1, choice1, default=0)
    df2["name"+str("_1")] = np.select(condition2, choice2, default=0)
Both columns end up holding only the values computed for the name 'c'. That is to be expected, because each iteration overwrites the columns, so only the last computed values survive.
Is there another way to use np.select effectively here?
Expected output:
    name    count   name_0      name_1
0   a       1       0.000000    0.500000
1   b       1       0.000000    1.000000
2   c       0       0.333333    0.000000
3   a       1       0.000000    0.500000
4   c       1       0.000000    0.666667
5   a       0       0.500000    0.000000
6   b       1       0.000000    1.000000
7   c       1       0.000000    0.666667
8   a       0       0.500000    0.000000
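One possible way to get this output without the loop is sketched below. It assumes "count" only ever contains 0 and 1, so the per-name mean of "count" equals the share of ones. The variable names `one_pb` and `zero_pb` are reused from the loop for illustration only:

```python
import numpy as np
import pandas as pd

col1 = ['a', 'b', 'c', 'a', 'c', 'a', 'b', 'c', 'a']
col2 = [1, 1, 0, 1, 1, 0, 1, 1, 0]
df2 = pd.DataFrame(zip(col1, col2), columns=['name', 'count'])

# Share of ones per name, broadcast back to every row of that name.
# Assumes 'count' is strictly 0/1, so the mean is the ratio of ones.
one_pb = df2.groupby('name')['count'].transform('mean')
zero_pb = 1 - one_pb

# Keep each ratio only on rows whose count matches the column; 0 elsewhere.
df2['name_0'] = np.select([df2['count'].eq(0)], [zero_pb], default=0)
df2['name_1'] = np.select([df2['count'].eq(1)], [one_pb], default=0)

print(df2)
```

Because the ratios are computed for all rows at once, each column is assigned a single time and nothing is overwritten between names.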