I am using aggregate to get the means of several variables by a specific category (cy), but there are a few NA's in my dataframe. I am using aggregate rather than ddply because, from my understanding, it takes care of NA's similarly to using na.rm=TRUE. The problem is that it drops every row containing an NA before computing the means, so the means are slightly off.
Dataframe:
        bt     cy  cl    pf   ne      YH    YI
1        1      H   1    95 70.0      20    20
2        2      H   1    25 70.0      46    50
3        1      H   1     0 70.0      40    45
4        2      H   1    95 59.9      40    40
5        2      H   1    75 59.9      36    57
6        2      H   1     5 70.0      35    43
7        1      H   1    50 59.9      20    36
8        2      H   1    95 59.9      40    42
9        3      H   1    95 49.5      17    48
10       2      H   1     5 70.0      42    42
11       2      H   1    95 49.5      19    30
12       3      H   1    25 49.5      33    51
13       1      H   1    75 49.5       5    26
14       1      H   1     5 70.0      35    37
15       1      H   1     5 59.9      20    40
16       2      H   1    95 49.5      29    53
17       2      H   1    75 70.0      41    41
18       2      H   1     0 70.0      10    10
19       2      H   1    95 49.5      25    32
20       1      H   1    95 59.9      10    11
21       2      H   1     0 29.5      20    28
22       1      H   1    95 29.5      11    27
23       2      H   1    25 59.9      26    26
24       1      H   1     5 70.0      30    30
25       3      H   1    25 29.5      20    30
26       3      H   1    50 70.0       5     5
27       1      H   1     0 59.9       3    10
28       1      K   1     5 49.5      25    29
29       2      K   1     0 49.5      30    32
30       1      K   1    95 49.5      13    24
31       1      K   1     0 39.5      13    13
32       2      M   1    NA 70.0      45    50
33       3      M   1    25 59.9       3    34
The full dataframe has 74 rows, and there are NA's peppered throughout all but two columns (cy and cl).
My code looks like this:
meancnty <- aggregate(cbind(pf, ne, YH, YI) ~ cy, data = newChart, FUN = mean)
I double checked in Excel, and the means this function produces are for a dataset of N=69, that is, after removing all rows containing NA's. Is there any way to tell R to ignore the NA's within each variable rather than remove the whole rows, other than taking the mean of each variable by county separately (I have a lot of variables to summarize by many different categories)?
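The behaviour can be reproduced on a small made-up data frame. The sketch below uses a hypothetical `toy` data frame (not the real `newChart`) and assumes the row dropping comes from the formula interface's default `na.action = na.omit`. The second call is one possible way to keep the rows and let `mean` skip NA's column by column. It has not been checked against the full 74-row data.

```r
# Hypothetical toy data standing in for newChart (values are made up)
toy <- data.frame(
  cy = c("H", "H", "H", "M", "M", "M"),
  pf = c(10, 20, NA, 30, NA, 50),
  YH = c(1, 2, 3, 4, 5, NA)
)

# Formula interface with defaults: na.action = na.omit removes every row
# that has an NA in any variable of the formula before FUN is applied,
# so only rows 1, 2 and 4 contribute to the means.
dropped <- aggregate(cbind(pf, YH) ~ cy, data = toy, FUN = mean)

# Possible alternative: keep all rows (na.action = na.pass) and pass
# na.rm = TRUE through to mean(), so NA's are skipped per column.
kept <- aggregate(cbind(pf, YH) ~ cy, data = toy, FUN = mean,
                  na.rm = TRUE, na.action = na.pass)

print(dropped)
print(kept)
```

In this toy case the YH mean for group H is 1.5 with the default call (row 3 is discarded because its pf is NA) and 2 with the second call, which matches the kind of discrepancy described above.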
Thank you