In previous versions of R, I could combine factor levels whose counts fell below a "significant" threshold using the following little function:
whittle = function(data, cutoff_val){
  #convert to a data frame
  tab = as.data.frame.table(table(data))
  #returns vector of indices where value is below cutoff_val
  idx = which(tab$Freq < cutoff_val)
  levels(data)[idx] = "Other"
  return(data)
}
This takes a factor vector, finds the levels that don't appear "often enough", and combines all of them into a single "Other" level. For example, here are the level counts in my data:
> sort(table(data$State))
   05    27    35    40    54    84     9    AP    AU    BE    BI    DI     G    GP    GU    GZ    HN    HR    JA    JM    KE    KU     L    LD    LI    MH    NA 
    1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1 
   OU     P    PL    RM    SR    TB    TP    TW     U    VD    VI    VS    WS     X    ZH    47    BL    BS    DL     M    MB    NB    RP    TU    11    DU    KA 
    1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     2     2     2     2     2     2     2     2     2     3     3     3 
   BW    ND    NS    WY    AK    SD    13    QC    01    BC    MT    AB    HE    ID     J    NO    LN    NM    ON    NE    VT    UT    IA    MS    AO    AR    ME 
    4     4     4     4     5     5     6     6     7     7     7     8     8     8     9    10    11    17    23    26    26    30    31    31    38    40    44 
   OR    KS    HI    NV    WI    OK    KY    IN    WV    AL    CO    WA    MN    NH    MO    SC    LA    TN    AZ    IL    NC    MI    GA    OH    **    CT    DE 
   45    47    48    57    57    64   106   108   112   113   120   125   131   131   135   138   198   200   233   492   511   579   645   646   840   873  1432 
   RI    DC    TX    MA    FL    VA    MD    CA    NJ    PA    NY 
 1782  2513  6992  7027 10527 11016 11836 12221 15485 16359 34045 
Now when I use whittle, I get the following warning:
> delete = whittle(data$State, 1000)
Warning message:
In `levels<-`(`*tmp*`, value = c("Other", "Other", "Other", "Other",  :
  duplicated levels in factors are deprecated
How can I modify my function so that it has the same effect but doesn't rely on these "deprecated" duplicated factor levels? Would converting to character, tabulating, and then replacing the rare values with the string "Other" be the way to go?
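A minimal sketch of that character-based idea, in case it helps frame the question. The name whittle_chr is hypothetical, and this is only one possible approach, not necessarily the recommended one:

```r
# Hypothetical variant of whittle(); the name whittle_chr is illustrative only.
whittle_chr = function(data, cutoff_val){
  # work on character values so no factor levels are ever duplicated
  chr = as.character(data)
  # count how often each distinct value occurs (missing values are not counted)
  counts = table(chr)
  # names of the values that occur fewer than cutoff_val times
  rare = names(counts)[counts < cutoff_val]
  # relabel the rare values, then rebuild the factor from scratch
  chr[chr %in% rare] = "Other"
  return(factor(chr))
}
```

Called as whittle_chr(data$State, 1000), this should lump every state with fewer than 1000 rows into "Other". Unlike the original, it rebuilds the factor, so the level order is re-sorted and unused levels are dropped.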