Here is a summary of definitions.
container
- An object with a
__contains__ method
generator
- A function which returns an iterator.
iterable
- A object with an
__iter__() or __getitem__() method.
- Examples of iterables include all sequence types (such as list,
str, and tuple) and some non-sequence types like dict and file.
- When an iterable object is passed as an argument to the builtin
function
iter(), it returns an iterator for the object. This
iterator is good for one pass over the set of values.
iterator
- An iterable which has a
next() method.
- Iterators are required to have an
__iter__() method that returns the iterator object itself.
- An iterator is
good for one pass over the set of values.
sequence
- An iterable which supports efficient element access using integer
indices
via the
__getitem__() special method and defines a len() method that returns
the length of the sequence.
- Some built-in sequence types are
list, str,
tuple, and unicode.
- Note that dict also supports
__getitem__() and
__len__(), but is considered a mapping rather than a sequence because the
lookups use arbitrary immutable keys rather than integers.
Now there is a multitude of ways of testing if an object is an iterable, or iterator, or sequence of some sort. Here is a summary of these ways, and how they classify various kinds of objects:
Iterable Iterator iter_is_self Sequence MutableSeq
object
[] True False False True True
() True False False True False
set([]) True False False False False
{} True False False False False
deque([]) True False False False False
<listiterator> True True True False False
<generator> True True True False False
string True False False True False
unicode True False False True False
<open> True True True False False
xrange(1) True False False True False
Foo.__iter__ True False False False False
Sized has_len has_iter has_contains
object
[] True True True True
() True True True True
set([]) True True True True
{} True True True True
deque([]) True True True False
<listiterator> False False True False
<generator> False False True False
string True True False True
unicode True True False True
<open> False False True False
xrange(1) True True True False
Foo.__iter__ False False True False
Each columns refers to a different way to classify iterables, each rows refers to a different kind of object.
import pandas as pd
import collections
import os
def col_iterable(obj):
return isinstance(obj, collections.Iterable)
def col_iterator(obj):
return isinstance(obj, collections.Iterator)
def col_sequence(obj):
return isinstance(obj, collections.Sequence)
def col_mutable_sequence(obj):
return isinstance(obj, collections.MutableSequence)
def col_sized(obj):
return isinstance(obj, collections.Sized)
def has_len(obj):
return hasattr(obj, '__len__')
def listtype(obj):
return isinstance(obj, types.ListType)
def tupletype(obj):
return isinstance(obj, types.TupleType)
def has_iter(obj):
"Could this be a way to distinguish basestrings from other iterables?"
return hasattr(obj, '__iter__')
def has_contains(obj):
return hasattr(obj, '__contains__')
def iter_is_self(obj):
"Seems identical to col_iterator"
return iter(obj) is obj
def gen():
yield
def short_str(obj):
text = str(obj)
if text.startswith('<'):
text = text.split()[0] + '>'
return text
def isiterable():
class Foo(object):
def __init__(self):
self.data = [1, 2, 3]
def __iter__(self):
while True:
try:
yield self.data.pop(0)
except IndexError: # pop from empty list
return
def __repr__(self):
return "Foo.__iter__"
filename = 'mytestfile'
f = open(filename, 'w')
objs = [list(), tuple(), set(), dict(),
collections.deque(), iter([]), gen(), 'string', u'unicode',
f, xrange(1), Foo()]
tests = [
(short_str, 'object'),
(col_iterable, 'Iterable'),
(col_iterator, 'Iterator'),
(iter_is_self, 'iter_is_self'),
(col_sequence, 'Sequence'),
(col_mutable_sequence, 'MutableSeq'),
(col_sized, 'Sized'),
(has_len, 'has_len'),
(has_iter, 'has_iter'),
(has_contains, 'has_contains'),
]
funcs, labels = zip(*tests)
data = [[test(obj) for test in funcs] for obj in objs]
f.close()
os.unlink(filename)
df = pd.DataFrame(data, columns=labels)
df = df.set_index('object')
print(df.ix[:, 'Iterable':'MutableSeq'])
print
print(df.ix[:, 'Sized':])
isiterable()