Let's say I have a pandas Series, and I want to access a set of elements at specific indices, like so:
In [1]:
from pandas import Series
import numpy as np
s = Series(np.arange(0,10))
In [2]: s.loc[[3,7]]
Out[2]:
3 3
7 7
dtype: int64
The .loc indexer accepts a list for this type of selection, and the .iloc and .ix indexers work the same way.
However, if I use a tuple for the parameter, both .loc and .iloc fail:
In [5]: s.loc[(3,7)]
---------------------------------------------------------------------------
IndexingError Traceback (most recent call last)
........
IndexingError: Too many indexers
In [6]: s.iloc[(3,7)]
---------------------------------------------------------------------------
IndexingError Traceback (most recent call last)
........
IndexingError: Too many indexers
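As a side note, Python itself passes the same object to `__getitem__` for `obj[3, 7]` and `obj[(3, 7)]`, so an indexer cannot tell a tuple argument apart from two separate indexers. The sketch below uses a hypothetical `KeyEcho` class (not part of pandas) to show this. Whether that fully accounts for the errors above is an assumption.

```python
class KeyEcho:
    """Hypothetical helper: returns whatever key the subscript receives."""

    def __getitem__(self, key):
        return key


echo = KeyEcho()

bare_key = echo[3, 7]      # comma-separated subscript
tuple_key = echo[(3, 7)]   # explicit tuple
list_key = echo[[3, 7]]    # list

# The first two keys are the same tuple; only the list stays a single object.
print(bare_key, tuple_key, list_key)
```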
And .ix produces a strange result:
In [7]: s.ix[(3,7)]
Out[7]: 3
Now, I get that you can't even do this with a plain Python list:
In [27]:
x = list(range(0,10))
x[(3,7)]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-cefdde088328> in <module>()
1 x = list(range(0,10))
----> 2 x[(3,7)]
TypeError: list indices must be integers or slices, not tuple
To retrieve a set of specific indices from a list, you need to use a comprehension, as explained here.
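For reference, a minimal sketch of that comprehension approach, with `operator.itemgetter` from the standard library as an alternative. The names `wanted`, `picked`, and `picked_tuple` are just for illustration.

```python
from operator import itemgetter

x = list(range(0, 10))
wanted = (3, 7)  # positions to pull out; a tuple is fine here

# Build a new list from the requested positions.
picked = [x[i] for i in wanted]

# itemgetter does the same lookup but returns a tuple when given several keys.
picked_tuple = itemgetter(*wanted)(x)

print(picked, picked_tuple)
```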
On the other hand, using a tuple to select rows from a pandas DataFrame seems to work fine with all three indexers. Here's an example with .loc:
In [8]:
from pandas import DataFrame
df = DataFrame({"x" : np.arange(0,10)})
In [9]:
df.loc[(3,7),"x"]
Out[9]:
3 3
7 7
Name: x, dtype: int64
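For comparison, converting the tuple to a list before indexing gives the same selection on both Series and DataFrame. This is only a sketch of a workaround, with `wanted` as an illustrative name. `.ix` is left out because it has been removed from recent pandas versions.

```python
import numpy as np
from pandas import DataFrame, Series

s = Series(np.arange(0, 10))
df = DataFrame({"x": np.arange(0, 10)})
wanted = (3, 7)

# A list is treated as a set of labels/positions along a single axis.
by_label = s.loc[list(wanted)]
by_position = s.iloc[list(wanted)]
from_frame = df.loc[list(wanted), "x"]

print(by_label.tolist(), by_position.tolist(), from_frame.tolist())
```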
My three questions are:
- Why won't the Series indexers accept a tuple? It would seem natural to use a tuple since the set of desired indices is an immutable, single-use parameter. Is this solely for the purpose of mimicking the list interface?
- What is the explanation for the strange Series.ix result?
- Why the inconsistency between Series and DataFrame on this matter?