I have a data frame with a row of data like this:
play_by_play = pd.DataFrame([{
    "players": "Tom Brady; Mike Evans; Tristan Wirfs; Leonard Fournette; Chris Godwin", 
               "down": 1, 
               "to_go": 10,
                "play_type": 'pass',
               "yards_gained": 8,
               "pass_attempt": 1,
               "complete_pass": 1,
               "rush_attempt": 0
              }])
I want to keep track of stats while the given players are on the field by using group by and aggregate functions. If I were looking to do this player-by-player, I would use play_by_play["players"].str.contains("Tom Brady") and aggregating the data using that filter, but I am looking to automate this. The solution I've landed on is to duplicate rows and have the "players" value split for each row. It would look something like this:
| player | down | to_go | play_type | yards_gained | pass_attempt | complete_pass | rush_attempt | 
|---|---|---|---|---|---|---|---|
| "Tom Brady" | 1 | 10 | pass | 8 | 1 | 1 | 0 | 
| "Mike Evans" | 1 | 10 | pass | 8 | 1 | 1 | 0 | 
| "Tristan Wirfs" | 1 | 10 | pass | 8 | 1 | 1 | 0 | 
| "Leonard Fournette" | 1 | 10 | pass | 8 | 1 | 1 | 0 | 
| "Chris Godwin" | 1 | 10 | pass | 8 | 1 | 1 | 0 | 
How could I accomplish this? As I mentioned before, this needs to be pretty scalable, as there are thousands of rows of data. If there's an easier way to group by based on unique values contained in a semicolon-separated column, I'm more than happy to go that route.
 
     
    