I'm trying to reduce the size of my data set, so I came up with this rule: for each month, I want to randomly select only 5 to 10 sales records per dealer (each dealer has a unique ID). The data looks like this:
   Date | Product  | Revenue | Dealer ID
Jan 7,18| XXX      | 10      | 1212
Jan 7,18| YYY      | 13      | 1212
Jan 7,18| XXX      | 20      | 2500
Jan 7,18| ZZZ      | 5       | 1212
....
Jan 8,18| ZZZ      | 15      | 1212
Jan 8,18| AAA      | 17      | 2500
Jan 8,18| MMM      | 9       | 1318
...
The output for one dealer's January sales records should look something like this:
   Date  | Product  | Revenue | Dealer ID
Jan 7,18 | XXX      | 10      | 1212
Jan 7,18 | ZZZ      | 5       | 1212
Jan 10,18| ZZZ      | 15      | 1212
Jan 17,18| AAA      | 17      | 1212
Jan 22,18| MMM      | 9       | 1212
Jan 27,18| ZZZ      | 15      | 1212
Jan 28,18| MMM      | 9       | 1212
...
My plan is to write a nested for loop: for each dealer ID and for each month, randomly choose n entries, where n is a random number from 5 to 10. I'm not sure how to loop through the months, and I can't find a way to grab random entries.
Does anyone have an easier way to do this task? Here's my attempt:
import numpy as np

# np.unique already returns the IDs sorted in ascending order
unique_ID = np.unique(df['Dealer ID'])
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]
years = range(2018, 2022)
for y in years:
    for m in months:
        for i in unique_ID:
            # pick out all the entries with that Dealer ID
            # (a boolean mask; `if df['Dealer ID'] == i:` raises a ValueError)
            dealer_rows = df[df['Dealer ID'] == i]
            # TODO: keep only the rows from month m of year y
            # TODO: randomly select 5 to 10 of those rows (I tried a fixed 8)
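
One direction that might avoid the nested loops is to let pandas do the grouping. Below is only a rough sketch, not a tested solution for the real file. It assumes the `Date` column holds strings like `Jan 7,18` (hence the `%b %d,%y` format), and the function name `sample_per_dealer_month` and its `low`, `high` and `seed` parameters are made up for illustration. Groups with fewer than 5 rows are kept whole, which is also an assumption about the desired behavior.

```python
import numpy as np
import pandas as pd


def sample_per_dealer_month(df, low=5, high=10, seed=None):
    """Keep a random `low`..`high` rows per dealer per calendar month (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumption: dates look like "Jan 7,18" (abbreviated month, day, 2-digit year)
    dates = pd.to_datetime(df["Date"], format="%b %d,%y")
    # Year-month key, so January 2018 and January 2019 are separate groups
    month_key = dates.dt.to_period("M")

    pieces = []
    for _, group in df.groupby([df["Dealer ID"], month_key]):
        # Draw n between low and high (inclusive), capped at the group size
        n = min(len(group), int(rng.integers(low, high + 1)))
        # Derive a per-group seed so the whole run is reproducible from `seed`
        group_seed = int(rng.integers(0, 2**32 - 1))
        pieces.append(group.sample(n=n, random_state=group_seed))

    if not pieces:
        # Nothing to sample: return an empty frame with the same columns
        return df.iloc[0:0]
    # Restore the original row order
    return pd.concat(pieces).sort_index()
```

With this, something like `reduced = sample_per_dealer_month(df, seed=42)` would give the smaller table, and there is no need to list month names or years by hand.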
