I have to read large .csv files of around 20 MB each. Each file is a table of 8 columns and 5198 rows, and I have to compute some statistics over one specific column, I.
I have n different files, and this is what I am doing:
import pandas as pd

I = 0
for k in range(n):  # n files, indices 0 .. n-1
    df = pd.read_csv(pathS + 'run_TestRandom_%d.csv' % k, sep=' ')
    I += df['I']
I = I / n  # average: divide by n, not by k (k ends at n-1)
This process takes 0.65 s, and I am wondering if there is a faster way.
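For reference, one variant I have sketched (untested, and assuming the files have a header row naming the column 'I') is to let read_csv parse only that one column via its usecols parameter, so the other 7 columns are skipped:

import pandas as pd

I = 0
for k in range(n):
    # parse only column 'I'; the other columns are never materialized
    col = pd.read_csv(pathS + 'run_TestRandom_%d.csv' % k,
                      sep=' ', usecols=['I'])['I']
    I += col
I = I / n  # average over the n files

I don't know whether parsing or disk I/O dominates here, so I am not sure this actually helps. Is this, or something else entirely, the right direction?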