I have a huge data frame in R some of my columns are
STATE FIPS 
COUNTY FIPS 
EVENT TYPE
EPISODE
I want to aggregate the data as follows
first split it up by state, and then by county within each state, then by event type within each county, and then finally remove all the duplicate "episodes" within each event type. Finally, I want to count the value of each and have that be the output.
I would like the final data frame to be a list of
Every county (with its respective state), the event type, and the count of episode numbers for each event.
For example.
Maryland    Montgomery  Tornado 7
Maryland    Howard      Tornado    19
New York    Ontario    Hurricane 7
and so on so that each county and each event has its own row.
Happy to post dummy data but I am not sure how to do that... thanks!
structure(list(EPISODE_ID = c(1099647L, 1104972L, 1104973L, 1104971L, 1105463L, 1099373L, 1104298L, 1105465L, 1102576L, 1102576L, 1102576L, 1102581L, 1101019L, 1099291L, 1099292L, 1105466L, 1105467L), EVENT_ID = c(5176080, 5166972, 5166973, 5166971, 5167810, 5163387, 5161925, 5168217, 5171130, 5171131, 5171132, 5171137, 5173230, 5164469, 5164470, 5168218, 5168219), STATE = c("ALASKA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALABAMA", "ALASKA", "ALASKA", "ALASKA", "ALABAMA", "ALABAMA"), STATE_FIPS = c(2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1), YEAR = c(2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000), MONTH_NAME = c("August", "December", "December", "December", "December", "July", "November", "December", "September", "September", "September", "September", "September", "July", "July", "December", "December"), EVENT_TYPE = c("Flash Flood", "Tornado", "Tornado", "Tornado", "Tornado", "Flash Flood", "Tornado", "Tornado", "Flash Flood", "Flash Flood", "Flash Flood", "Flash Flood", "Flash Flood", "Flash Flood", "Flash Flood", "Tornado", "Tornado"), CZ_TYPE = c("C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C" ), CZ_FIPS = c(14, 115, 55, 125, 61, 73, 65, 69, 1, 101, 51, 111, 14, 7, 5, 45, 67)), row.names = c(NA, -17L), class = c("data.table", "data.frame"))
This display :
   EPISODE_ID EVENT_ID   STATE STATE_FIPS YEAR MONTH_NAME  EVENT_TYPE CZ_TYPE
1     1099647  5176080  ALASKA          2 2000     August Flash Flood       C
2     1104972  5166972 ALABAMA          1 2000   December     Tornado       C
3     1104973  5166973 ALABAMA          1 2000   December     Tornado       C
4     1104971  5166971 ALABAMA          1 2000   December     Tornado       C
5     1105463  5167810 ALABAMA          1 2000   December     Tornado       C
6     1099373  5163387 ALABAMA          1 2000       July Flash Flood       C
7     1104298  5161925 ALABAMA          1 2000   November     Tornado       C
8     1105465  5168217 ALABAMA          1 2000   December     Tornado       C
9     1102576  5171130 ALABAMA          1 2000  September Flash Flood       C
10    1102576  5171131 ALABAMA          1 2000  September Flash Flood       C
11    1102576  5171132 ALABAMA          1 2000  September Flash Flood       C
12    1102581  5171137 ALABAMA          1 2000  September Flash Flood       C
13    1101019  5173230  ALASKA          2 2000  September Flash Flood       C
14    1099291  5164469  ALASKA          2 2000       July Flash Flood       C
15    1099292  5164470  ALASKA          2 2000       July Flash Flood       C
16    1105466  5168218 ALABAMA          1 2000   December     Tornado       C
17    1105467  5168219 ALABAMA          1 2000   December     Tornado       C
   CZ_FIPS
1       14
2      115
3       55
4      125
5       61
6       73
7       65
8       69
9        1
10     101
11      51
12     111
13      14
14       7
15       5
16      45
17      67
 
     
    