I have a data.frame with 11717 obs. of 15 variables. See below:
$ SCC                : Factor w/ 11717 levels "10100101","10100102",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Data.Category      : Factor w/ 6 levels "Biogenic","Event",..: 6 6 6 6 6 6 6 6 6 6 ...
$ Short.Name         : Factor w/ 11238 levels "","2,4-D Salts and Esters Prod /Process Vents, 2,4-D Recovery: Filtration",..: 3283 3284 3293 3291 3290 3294 3295 3296 3292 3289 ...
$ EI.Sector          : Factor w/ 59 levels "Agriculture - Crops & Livestock Dust",..: 18 18 18 18 18 18 18 18 18 18 ...
$ Option.Group       : Factor w/ 25 levels "","C/I Kerosene",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Option.Set         : Factor w/ 18 levels "","A","B","B1A",..: 1 1 1 1 1 1 1 1 1 1 ...
$ SCC.Level.One      : Factor w/ 17 levels "Brick Kilns",..: 3 3 3 3 3 3 3 3 3 3 ...
$ SCC.Level.Two      : Factor w/ 146 levels "","Agricultural Chemicals Production",..: 32 32 32 32 32 32 32 32 32 32 ...
$ SCC.Level.Three    : Factor w/ 1061 levels "","100% Biosolids (e.g., sewage sludge, manure, mixtures of these matls)",..: 88 88 156 156 156 156 156 156 156 156 ...
$ SCC.Level.Four     : Factor w/ 6084 levels "","(NH4)2 SO4 Acid Bath System and Evaporator",..: 4455 5583 4466 4458 1341 5246 5584 5983 4461 776 ...
$ Map.To             : num  NA NA NA NA NA NA NA NA NA NA ...
$ Last.Inventory.Year: int  NA NA NA NA NA NA NA NA NA NA ...
$ Created_Date       : Factor w/ 57 levels "","1/27/2000 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Revised_Date       : Factor w/ 44 levels "","1/27/2000 0:00:00",..: 1 1 1 1 1 1 1 1 1 1 ...
$ Usage.Notes        : Factor w/ 21 levels ""," ","includes bleaching towers, washer hoods, filtrate tanks, vacuum pump exhausts",..: 1 1 1 1 1 1 1 1 1 1 ...
I am trying to make a search for the words "Combustion" and "Coal" and create a subset showing only where "Combustion" and "Coal" are combined in the same sentence OR the same row anywhere in the data.frame:
example of the words used in same sentence:
Fuel Comb - Electric Generation - Coal.
example of the words used in same row / different columns:
see screenshot (I don't have enough creds to attach a img). [screenshot][1]
Using RStudio search shows: 675 results for "Comb" and 251 results for "Coal". So the final combination should be equal or less than 251 if I'm correct.
I tried using grep and grepl. However the only way for me to use these functions is to repeat the process across each column before creating the subset (using match function for instance).
I find this to be a time consuming process. Would you have a better one?
[1]: https://i.stack.imgur.com/YJr5B.png
 
     
    