I have written code to parse a large set of emails (640,000 files); the output is a list of the filenames of emails sent on specific dates. The code is as follows:
import glob
from itertools import islice

def createListOfFilesByDate():
    searchDates = ["12 Mar 2012", "13 Mar 2012"]
    outfile = "EmailList.txt"
    sent = "Sent:"
    fileList = glob.glob("./Emails/*.txt")
    with open(outfile, 'w') as fout:
        for filename in fileList:
            foundDate = False
            with open(filename) as f:
                # Read at most the first 10 lines; islice avoids the
                # StopIteration that next(f) raises on shorter files,
                # and the with-block closes the file automatically.
                header = list(islice(f, 10))
            for line in header:
                if sent in line:
                    for searchDate in searchDates:
                        if searchDate in line:
                            foundDate = True
                            break
                if foundDate:
                    fout.write(filename + '\n')
                    break
The problem is that the code processes the first 10,000 emails quite quickly but then slows down significantly and takes a long time to get through the remaining files. I have investigated a number of possible causes but have not found one. Am I doing something inefficient here?
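For reference, this is roughly how I am measuring where the slowdown begins. It is only a diagnostic sketch; the helper name `timed_batches` and the batch size of 1,000 are illustrative, not part of the parsing code itself:

```python
import time

def timed_batches(items, batch_size=1000):
    """Yield items unchanged while printing throughput per batch,
    so a progressive slowdown shows up as a falling items/sec rate."""
    start = time.time()
    for i, item in enumerate(items, 1):
        yield item
        if i % batch_size == 0:
            elapsed = time.time() - start
            print("%d items: %.1f items/sec" % (i, batch_size / max(elapsed, 1e-9)))
            start = time.time()
```

Wrapping the file loop as `for filename in timed_batches(fileList):` shows the rate dropping steadily after roughly the first 10,000 files.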
 
    