I am trying to create a dummy file that I will later use for ML predictions. The input is about 2000 'routes', and I want the dummy to contain every year-month-day-hour combination for 7 days, i.e. 168 rows per route and about 350k rows in total. The problem is that pandas becomes terribly slow at appending rows once the DataFrame reaches a certain size.
I am using the following code (utils and convert are my own helpers, not shown here):
import datetime
import pandas as pd

DAYS = [0, 1, 2, 3, 4, 5, 6]
HODS = list(range(24))  # hours of the day, 0-23
ISODOW = {
    1: "monday",
    2: "tuesday",
    3: "wednesday",
    4: "thursday",
    5: "friday",
    6: "saturday",
    7: "sunday"
}
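def createMyPredictionDummy(start=None, sourceFile=(utils.mountBasePath + 'routeProperties.csv'), destFile=(utils.outputBasePath + 'ToBePredictedTTimes.csv')):
    '''Generate a dummy file that can be used for predictions'''
    # Resolve the default at call time; a default of datetime.datetime.now() in the
    # signature would be evaluated only once, when the function is defined
    if start is None:
        start = datetime.datetime.now()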
    data = ['route', 'someProperties']
    dataFile = data + ['yr', 'month', 'day', 'dow', 'hod']
    # New DataFrame with all required columns
    file = pd.DataFrame(columns=dataFile)
    # Old data frame that has only the target columns    
    df = pd.read_csv(sourceFile, converters=convert, delimiter=',')
    df = df[data]
    # Counter - To avoid constant lookup for length of the DF
    ix = 0
    routes = df['route'].drop_duplicates().tolist()
    # Iterate through all routes and create a row for every route-yr-month-day-hour combination for 7 days (about 350k rows)
    for no, route in enumerate(routes):
        print('Current route is %s which is no. %g out of %g' % (str(route), no+1, len(routes)))
        routeDF = df.loc[df['route'] == route].iloc[0].tolist()
        for i in range(0, 7):
            tmpDate = start + datetime.timedelta(days=i)
            day = tmpDate.day
            month = tmpDate.month
            year = tmpDate.year
            dow = ISODOW[tmpDate.isoweekday()]
            for hod in HODS:
                file.loc[ix] = routeDF + [year, month, day, dow, hod] # This is becoming terribly slow
                ix += 1
    file.to_csv(destFile, index=False)
    print('Wrote file')
I think the main problem is appending the rows one at a time with .loc[]. Is there a more efficient way to append a row?
If you have any other suggestions, I am happy to hear them all!
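For comparison, here is a minimal sketch of one commonly used alternative: collect the rows in a plain Python list and build the DataFrame once at the end, so that the frame is never grown row by row. It is only an illustration under assumptions: the function name build_prediction_rows and the tiny demo frame are hypothetical, and the real input would come from routeProperties.csv as in the code above.

```python
import datetime
import pandas as pd

ISODOW = {1: "monday", 2: "tuesday", 3: "wednesday", 4: "thursday",
          5: "friday", 6: "saturday", 7: "sunday"}

def build_prediction_rows(routes_df, start, days=7):
    """Hypothetical helper: one row per route-day-hour combination."""
    rows = []
    # Keep the first row of each route, mirroring .iloc[0] in the original loop
    unique_routes = routes_df.drop_duplicates(subset="route")
    for route_values in unique_routes.values.tolist():
        for i in range(days):
            d = start + datetime.timedelta(days=i)
            dow = ISODOW[d.isoweekday()]
            for hod in range(24):
                # Appending to a Python list is cheap compared to DataFrame.loc
                rows.append(route_values + [d.year, d.month, d.day, dow, hod])
    columns = list(routes_df.columns) + ["yr", "month", "day", "dow", "hod"]
    # Build the DataFrame a single time from all collected rows
    return pd.DataFrame(rows, columns=columns)

# Made-up demo data standing in for the real route properties
demo = pd.DataFrame({"route": ["A", "B", "A"], "someProperties": [1, 2, 3]})
result = build_prediction_rows(demo, datetime.datetime(2024, 1, 1))
```

The resulting frame can then be written out with result.to_csv(destFile, index=False) exactly as before; only the way the rows are accumulated changes.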
Thanks and best,
carbee