I am trying to shuffle the lines of a large CSV file in Python and then split it into multiple CSV files (with a set number of rows in each file), but I can't find a way to shuffle the large dataset and keep the header in each CSV. It would help a lot if someone knew how to do this.
Here's the code I found useful for splitting a CSV file:
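number_of_rows = 100  # data rows per output file

def write_splitted_csvs(part, header, lines):
    # Write one chunk to its own file, repeating the header at the top.
    with open('mycsvhere.csv' + str(part) + '.csv', 'w') as f_out:
        f_out.write(header)
        f_out.writelines(lines)

with open("mycsvhere.csv", "r") as f:
    count = 0
    header = f.readline()  # the first line is the header
    lines = []
    for line in f:
        count += 1
        lines.append(line)
        # Flush a full chunk as soon as number_of_rows lines are collected.
        if count % number_of_rows == 0:
            write_splitted_csvs(count // number_of_rows, header, lines)
            lines = []

    # Write any leftover rows that did not fill a complete chunk.
    if len(lines) > 0:
        write_splitted_csvs((count // number_of_rows) + 1, header, lines)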
If anyone knows how to shuffle all these split CSVs, this would help a lot! Thank you very much.
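For reference, here is a rough sketch of the behaviour I am after, assuming all the data rows fit in memory (which may not hold for a really large file) and that no field contains an embedded newline. The function name `shuffle_and_split` and the parameters `out_prefix` and `seed` are made up for illustration; only `random.Random(...).shuffle` and plain file I/O from the standard library are used.

```python
import random


def shuffle_and_split(src_path, rows_per_file, out_prefix, seed=None):
    """Shuffle the data rows of a CSV and write them out in chunks.

    Hypothetical helper: every output file starts with the original
    header line. Returns the list of file names written.
    """
    with open(src_path, "r") as f:
        header = f.readline()
        lines = f.readlines()  # assumption: the data rows fit in memory

    # Make sure the last row ends with a newline, otherwise it would be
    # glued to the following row once the order changes.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"

    # A seeded generator makes the shuffle reproducible; seed=None is random.
    random.Random(seed).shuffle(lines)

    out_paths = []
    for part, start in enumerate(range(0, len(lines), rows_per_file), start=1):
        out_path = f"{out_prefix}_{part}.csv"
        with open(out_path, "w") as f_out:
            f_out.write(header)
            f_out.writelines(lines[start:start + rows_per_file])
        out_paths.append(out_path)
    return out_paths
```

Calling something like `shuffle_and_split("mycsvhere.csv", 100, "mycsvhere_shuffled")` would then write `mycsvhere_shuffled_1.csv`, `mycsvhere_shuffled_2.csv`, and so on, each with the header on top. What I am unsure about is how to do the same when the file is too large to load at once.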