I have some pretty large text files (>2 GB) that I would like to process word by word. The files are space-delimited text files with no line breaks (all words are on a single line). I want to take each word, test whether it is a dictionary word (using enchant), and if so, write it to a new file.
This is my code right now:
import enchant

d = enchant.Dict("en_US")  # the dictionary I'm checking against

with open('big_file_of_words', 'r') as in_file:
    with open('output_file', 'w') as out_file:
        # This reads the entire file into memory at once, which is the problem.
        words = in_file.read().split(' ')
        for word in words:
            if d.check(word):
                out_file.write("%s " % word)
I looked at lazy method for reading big file in python, which suggests using yield to read in chunks, but I am concerned that chunks of a predetermined size will split words in the middle. Basically, I want the chunks to be as close to the specified size as possible while splitting only on spaces. Any suggestions?
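To make it concrete, this is roughly what I'm picturing: a generator that reads the file in fixed-size chunks but carries any trailing partial word over into the next chunk, so words are never split. The read_words name and the 1 MB chunk size are just placeholders, and I'm assuming an en_US enchant dictionary; I haven't tested this on the real files:

    import enchant

    def read_words(file_obj, chunk_size=1024 * 1024):
        """Yield whole words from a space-delimited file, reading roughly
        chunk_size characters at a time without splitting words."""
        leftover = ''
        while True:
            chunk = file_obj.read(chunk_size)
            if not chunk:
                break
            chunk = leftover + chunk
            words = chunk.split(' ')
            # The last piece may be an incomplete word; carry it into the next chunk.
            leftover = words.pop()
            for word in words:
                if word:  # skip empty strings from consecutive spaces
                    yield word
        if leftover:
            yield leftover

    d = enchant.Dict("en_US")
    with open('big_file_of_words', 'r') as in_file, open('output_file', 'w') as out_file:
        for word in read_words(in_file):
            if d.check(word):
                out_file.write("%s " % word)

Is this a reasonable approach, or is there a better way to do it?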