How can I select the first representative element for each group of a DataFrameGroupBy object?

Question

My question is about selecting the first most frequent element for each group of a DataFrameGroupBy object. Specifically, I have the following dataframe:

import pandas as pd

data = [
    [1000, 1, 1], [1000, 1, 1], [1000, 1, 1], [1000, 1, 2], [1000, 1, 2],
    [1000, 1, 2], [2000, 0, 1], [2000, 0, 1], [2000, 1, 2],
    [2000, 0, 2], [2000, 1, 2]]
df = pd.DataFrame(data, columns=['route_id', 'direction_id', 'trip_id'])

Then I group my df by the columns route_id and direction_id using:

t_groups = df.groupby(['route_id','direction_id'])

For each unique route_id, direction_id combination, I would like to keep the first most frequent trip_id.

I have tried applying value_counts(), but I cannot get the first most frequent trip_id value from it.

My expected output is:

   route_id  direction_id  trip_id
0      1000             1        1
1      2000             0        1
2      2000             1        2

Any suggestions?

Is this what you're looking for? [GroupBy pandas DataFrame and select most common value](/q/15222754/4518341). TL;DR `t_groups['trip_id'].agg(pd.Series.mode)` — wjandrea, Dec 02 '22 at 17:07
Whoops, actually, if there's more than one most common value, it'll return both but I thought it'd only return the first. So, `.agg(lambda s: s.mode()[0])`. (If you wanted both, you could do `.apply(pd.Series.mode)`.) — wjandrea, Dec 02 '22 at 17:11
BTW, welcome to Stack Overflow! Check out the [tour], and [ask] if you want tips. It'd help to post the code you already tried, but apart from that, this is a great first question :) — wjandrea, Dec 02 '22 at 17:12
Thank you very much @wjandrea. I will have this in mind for my next questions! — ask_10, Dec 06 '22 at 12:26
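
For reference, the `.agg(lambda s: s.mode()[0])` suggestion from the comments above could be applied to the question's dataframe roughly as follows. This is a minimal sketch, not a definitive solution: it uses `.iloc[0]` instead of `[0]` (equivalent here) and the variable name `result` is only illustrative. Note that `Series.mode()` returns the tied values in sorted order, so "first" here means the smallest of the most frequent values, not the one that appears first in the data.

```python
import pandas as pd

data = [
    [1000, 1, 1], [1000, 1, 1], [1000, 1, 1], [1000, 1, 2], [1000, 1, 2],
    [1000, 1, 2], [2000, 0, 1], [2000, 0, 1], [2000, 1, 2],
    [2000, 0, 2], [2000, 1, 2]]
df = pd.DataFrame(data, columns=['route_id', 'direction_id', 'trip_id'])

# Series.mode() returns the most frequent value(s) in sorted order,
# so .iloc[0] picks the smallest value when several are tied.
result = (
    df.groupby(['route_id', 'direction_id'])['trip_id']
      .agg(lambda s: s.mode().iloc[0])
      .reset_index()
)
print(result)
```

On this data the three groups come out with trip_id 1, 1 and 2, which matches the expected output in the question.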

score 0 · Answer 1 · edited Dec 02 '22 at 17:28


This is what you are looking for.

df = df.groupby(['route_id', 'direction_id']).first().reset_index()

The reset_index() call just moves the group keys from the index back into regular columns, so the result looks exactly like the output you want.

edited Dec 02 '22 at 17:28 by wjandrea
answered Dec 02 '22 at 17:24 by Amir nazary

I don't think this is what OP's looking for. It sounds like they're actually looking for the first *most frequent* value. Their example is ambiguous though. – wjandrea Dec 02 '22 at 17:27
That's right @wjandrea – ask_10 Dec 06 '22 at 12:25
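
To illustrate the distinction raised in these comments: on the question's data, `first()` and the most frequent value happen to coincide for every group, which is why the example is ambiguous. The sketch below uses a small hypothetical dataframe (`toy`, not from the question) where the two differ; the names `keys`, `first_row` and `most_frequent` are likewise only illustrative.

```python
import pandas as pd

# Hypothetical data (not from the question): trip_id 2 appears first,
# but trip_id 1 is the most frequent value within the single group.
toy = pd.DataFrame(
    [[1000, 1, 2], [1000, 1, 1], [1000, 1, 1]],
    columns=['route_id', 'direction_id', 'trip_id'])

keys = ['route_id', 'direction_id']

# first() takes the first row of each group, regardless of frequency.
first_row = toy.groupby(keys)['trip_id'].first().reset_index()

# mode() takes the most frequent value (smallest one if there is a tie).
most_frequent = (
    toy.groupby(keys)['trip_id']
       .agg(lambda s: s.mode().iloc[0])
       .reset_index()
)

print(first_row)
print(most_frequent)
```

Here `first_row` reports trip_id 2 while `most_frequent` reports trip_id 1, so the two approaches are not interchangeable in general.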
