pandas.DataFrame.groupby takes an argument group_keys, which I gather is supposed to control how the group keys are included in the dataframe subsets. According to the documentation:
group_keys : boolean, default True
When calling apply, add group keys to index to identify pieces
However, I can't really find any examples where group_keys makes an actual difference:
import pandas as pd
df = pd.DataFrame([[0, 1, 3],
                   [3, 1, 1],
                   [3, 0, 0],
                   [2, 3, 3],
                   [2, 1, 0]], columns=list('xyz'))
gby = df.groupby('x')
gby_k = df.groupby('x', group_keys=False)
It doesn't make a difference in the output of apply:
ap = gby.apply(pd.DataFrame.sum)
#    x  y  z
# x         
# 0  0  1  3
# 2  4  4  3
# 3  6  1  1
ap_k = gby_k.apply(pd.DataFrame.sum)
#    x  y  z
# x         
# 0  0  1  3
# 2  4  4  3
# 3  6  1  1
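To rule out misreading the printed frames, the same comparison can be done programmatically. This is only a sketch: it selects the non-grouping columns `y` and `z` explicitly (my own choice, so the result does not depend on how a given pandas version treats the grouping column inside apply) and uses a lambda in place of pd.DataFrame.sum. I would expect it to report that the two results are equal.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 3],
                   [3, 1, 1],
                   [3, 0, 0],
                   [2, 3, 3],
                   [2, 1, 0]], columns=list('xyz'))

# Assumption: restrict to the non-grouping columns so the grouping
# column 'x' is not part of what the applied function sees.
cols = ['y', 'z']

# The applied function reduces each group to a single Series of sums.
sum_keys = df.groupby('x', group_keys=True)[cols].apply(lambda g: g.sum())
sum_no_keys = df.groupby('x', group_keys=False)[cols].apply(lambda g: g.sum())

# Compare values and index in one step.
same_for_reduction = sum_keys.equals(sum_no_keys)
print(same_for_reduction)
```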
And even if you print out the grouped subsets as you go, the results are still identical:
def printer_func(x):
    print(x)
    return x
print('gby')
print('--------------')
gby.apply(printer_func)
print('--------------')
print('gby_k')
print('--------------')
gby_k.apply(printer_func)
print('--------------')
# gby
# --------------
#    x  y  z
# 0  0  1  3
#    x  y  z
# 0  0  1  3
#    x  y  z
# 3  2  3  3
# 4  2  1  0
#    x  y  z
# 1  3  1  1
# 2  3  0  0
# --------------
# gby_k
# --------------
#    x  y  z
# 0  0  1  3
#    x  y  z
# 0  0  1  3
#    x  y  z
# 3  2  3  3
# 4  2  1  0
#    x  y  z
# 1  3  1  1
# 2  3  0  0
# --------------
I considered the possibility that the default is not actually True, but passing group_keys=True explicitly doesn't make a difference either. What exactly is this argument for?
(Run on pandas version 0.18.1)
Edit:
I did find a case where group_keys changes the behavior, based on this answer:
import pandas as pd
import numpy as np
row_idx = pd.MultiIndex.from_product(((0, 1), (2, 3, 4)))
d = pd.DataFrame([[4, 3], [1, 3], [1, 1], [2, 4], [0, 1], [4, 2]], index=row_idx)
df_n = d.groupby(level=0).apply(lambda x: x.nlargest(2, [0]))
#        0  1
# 0 0 2  4  3
#     3  1  3
# 1 1 4  4  2
#     2  2  4
df_k = d.groupby(level=0, group_keys=False).apply(lambda x: x.nlargest(2, [0]))
#      0  1
# 0 2  4  3
#   3  1  3
# 1 4  4  2
#   2  2  4
However, I'm still not clear on the underlying principle behind what group_keys is supposed to do. This behavior does not seem intuitive based on @piRSquared's answer.
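For what it's worth, the same effect seems reproducible on the first dataframe without a MultiIndex, as long as the applied function returns a subset of each group's rows rather than a reduction. The sketch below is my own minimal reproduction, not something taken from the documentation: `first_row` is a hypothetical helper, and the explicit selection of `y` and `z` is an assumption made to keep the grouping column out of the applied function.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 3],
                   [3, 1, 1],
                   [3, 0, 0],
                   [2, 3, 3],
                   [2, 1, 0]], columns=list('xyz'))

cols = ['y', 'z']


def first_row(group):
    """Hypothetical helper: keep only the first row of each group."""
    return group.head(1)


# The applied function returns a DataFrame whose index is a subset of
# the group's original index, so the pieces have to be concatenated.
with_keys = df.groupby('x', group_keys=True)[cols].apply(first_row)
without_keys = df.groupby('x', group_keys=False)[cols].apply(first_row)

# Number of index levels in each result.
print(with_keys.index.nlevels)
print(without_keys.index.nlevels)
```

Here the difference shows up only in the index: with group_keys=True the group key is prepended as an extra index level, and with group_keys=False the original row labels are kept as they are.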