To get the number of rows in a dataframe use:
df.shape[0]
(and df.shape[1] to get the number of columns).
As an alternative you can use
len(df)
or
len(df.index)
(and len(df.columns) for the columns)
shape is more versatile and more convenient than len(), especially for interactive work (just needs to be added at the end), but len is a bit faster (see also this answer).
To avoid: count() because it returns the number of non-NA/null observations over requested axis
len(df.index) is faster
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(24).reshape(8, 3),columns=['A', 'B', 'C'])
df['A'][5]=np.nan
df
# Out:
# A B C
# 0 0 1 2
# 1 3 4 5
# 2 6 7 8
# 3 9 10 11
# 4 12 13 14
# 5 NaN 16 17
# 6 18 19 20
# 7 21 22 23
%timeit df.shape[0]
# 100000 loops, best of 3: 4.22 µs per loop
%timeit len(df)
# 100000 loops, best of 3: 2.26 µs per loop
%timeit len(df.index)
# 1000000 loops, best of 3: 1.46 µs per loop
df.__len__ is just a call to len(df.index)
import inspect
print(inspect.getsource(pd.DataFrame.__len__))
# Out:
# def __len__(self):
# """Returns length of info axis, but here we use the index """
# return len(self.index)
Why you should not use count()
df.count()
# Out:
# A 7
# B 8
# C 8