I have a project where I need to read data from a relatively large .txt file containing 5 columns and about 25 million rows of comma-separated data, process the data, and then write the processed data to a new .txt file. My computer freezes when I try to process a file this large.
I've already written the function to process the data and it works on small input .txt files, so I just need to adjust it to work with the larger file.
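To give a sense of the format, each row of the input is five comma-separated fields, something like this (the values below are made up, only the shape matters):

a1,b1,c1,d1,e1
a2,b2,c2,d2,e2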
Here's an abridged version of my code:
import csv
import sys
def process_data(input_file, output_file):
    prod_dict = {}
    with open(input_file, "r") as file:
        # some code that reads all data from input file into dictionary
    # some code that sorts dictionary into an array with desired row order
    # list comprehension code that puts array into desired output form
    with open(output_file, "w", newline="") as myfile:
        wr = csv.writer(myfile)
        for i in final_array:
            wr.writerow(i)
def main():
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    process_data(input_file, output_file)
if __name__ == '__main__':
    main()
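For reference, I run the script from the command line with the input and output paths as arguments, something like this (the script and file names here are just placeholders):

python process_data.py big_input.txt processed_output.txt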