From my understanding, there are two ways to subset a dataframe in pandas:
a) df['columns']['rows']
b) df.loc['rows', 'columns']
I was following a guided case study, where the instruction was to select the first and last n rows of a column in a dataframe. The solution used Method A, whereas I tried Method B.
My method wasn't working and I couldn't for the life of me figure out why.
I've created a simplified version of the dataframe...
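import numpy as np
import pandas as pd

male = [6, 14, 12, 13, 21, 14, 14, 14, 14, 18]
female = [9, 11, 6, 10, 11, 13, 12, 11, 9, 11]
# Index labels run from 1 to 10, not 0 to 9
df = pd.DataFrame({'Male': male,
                   'Female': female},
                  index=np.arange(1, 11))
# Row-wise mean of the two columns, rounded to one decimal place
df['Mean'] = df[['Male', 'Female']].mean(axis=1).round(1)
df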
Selecting the first two rows works fine with both Method A and Method B:
print('Method A: \n', df['Mean'][:2])
print('Method B: \n', df.loc[:2, 'Mean'])
Method A: 
1     7.5
2    12.5
Method B: 
1     7.5
2    12.5
Selecting the last two rows, however, does not behave the same way. Method A returns the last two rows, as it should. Method B (.loc) doesn't: it returns every row of the column. Why is this, and how do I fix it?
print('Method A: \n', df['Mean'][-2:])
print('Method B: \n', df.loc[-2:, 'Mean'])
Method A: 
9     11.5
10    14.5
Method B: 
1      7.5
2     12.5
3      9.0
4     11.5
5     16.0
6     13.5
7     13.0
8     12.5
9     11.5
10    14.5
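For reference, here is a minimal sketch of what I suspect is going on, assuming .loc treats the slice bounds as index labels rather than positions. The names by_label, by_position and by_last_labels are just placeholders I made up for the comparison.

```python
import numpy as np
import pandas as pd

male = [6, 14, 12, 13, 21, 14, 14, 14, 14, 18]
female = [9, 11, 6, 10, 11, 13, 12, 11, 9, 11]
df = pd.DataFrame({'Male': male, 'Female': female},
                  index=np.arange(1, 11))
df['Mean'] = df[['Male', 'Female']].mean(axis=1).round(1)

# .loc slices by label. -2 is not a label in this index, but the index
# is sorted, so the slice starts at the first label >= -2: every row.
by_label = df.loc[-2:, 'Mean']

# .iloc slices by position, so -2: means "the last two rows".
by_position = df['Mean'].iloc[-2:]

# Staying with .loc: look up the last two labels first, then select them.
by_last_labels = df.loc[df.index[-2:], 'Mean']

print(len(by_label))            # number of rows .loc[-2:] returned
print(by_position.tolist())     # values from positional slicing
print(by_last_labels.tolist())  # values from label lookup
```

If that assumption holds, it would also explain why df.loc[:2, 'Mean'] looked right: the labels 1 and 2 just happen to be the first two rows, because the index starts at 1.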