You really shouldn't store multiple words in the same element. Make vectors like this:
genes <- c("gene1","gene2","gene3","gene4","gene5")
Anyway, assuming you work with a data frame called df, and that each entry in the fourth column is a single string in which the genes are separated by commas:
lis <- strsplit(df[,4], ",")
This gives a list rather than a data frame, in which each element is a character vector holding the separate genes of one row. Next, make a vector of the genes you are interested in (like above). Finally, do:
tab <- sapply(lis,function(x) any(genes %in% x))
Basically, for each row, %in% checks for each gene of interest whether it occurs among that row's genes. Next, any returns TRUE if at least one of those comparisons is TRUE. So, if any of the genes is found in x, the result for that row is TRUE.
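To see what happens for a single row, here is a minimal sketch of one iteration, using the entry "gene1,gene2,gene10" as an example. The variable name hits is only for illustration; x plays the role of the function argument above.

```r
genes <- c("gene1", "gene2", "gene3", "gene4", "gene5")

# Split one comma-separated entry into its separate genes
x <- strsplit("gene1,gene2,gene10", ",")[[1]]
x
# [1] "gene1"  "gene2"  "gene10"

# One TRUE/FALSE per gene of interest: is it present in x?
hits <- genes %in% x
hits
# [1]  TRUE  TRUE FALSE FALSE FALSE

# TRUE as soon as at least one gene of interest was found
any(hits)
# [1] TRUE
```

Note that the match is on whole gene names, so "gene1" does not match "gene10".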
For example:
df <- structure(list(col1 = 1:10, col2 = 1:10, col3 = 1:10, col4 = c("gene1,gene2,gene3",
"gene2,gene3", "gene6,gene8", "gene9,gene10", "gene1,gene2,gene10",
"gene5", "gene3,gene6", "gene1,gene2,gene8", "gene6,gene7", "gene1,gene4"
)), .Names = c("col1", "col2", "col3", "col4"), row.names = c(NA,
-10L), class = "data.frame")
genes <- c("gene1","gene2","gene3","gene4","gene5")
lis <- strsplit(df[,4], ",")
tab <- sapply(lis,function(x) any(genes %in% x))
tab
# [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
df
#    col1 col2 col3               col4
# 1     1    1    1  gene1,gene2,gene3
# 2     2    2    2        gene2,gene3
# 3     3    3    3        gene6,gene8
# 4     4    4    4       gene9,gene10
# 5     5    5    5 gene1,gene2,gene10
# 6     6    6    6              gene5
# 7     7    7    7        gene3,gene6
# 8     8    8    8  gene1,gene2,gene8
# 9     9    9    9        gene6,gene7
# 10   10   10   10        gene1,gene4
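Since the result is a logical vector with one value per row, it can be used directly to keep only the matching rows. The following is a sketch under the assumption that filtering is the end goal. It uses a smaller, made-up data frame with a factor column to show that wrapping the column in as.character() keeps strsplit working (strsplit only accepts character vectors). The names hits and df_hits are illustrative, and vapply is swapped in for sapply here only to guarantee a logical result.

```r
# Small illustrative data frame; col4 is deliberately a factor
df <- data.frame(col1 = 1:4,
                 col4 = c("gene1,gene2,gene3", "gene6,gene8",
                          "gene5", "gene9,gene10"),
                 stringsAsFactors = TRUE)
genes <- c("gene1", "gene2", "gene3", "gene4", "gene5")

# as.character() makes the split work for factor and character columns alike
lis <- strsplit(as.character(df$col4), ",")

# Like sapply, but always returns a logical vector with one value per row
hits <- vapply(lis, function(x) any(genes %in% x), logical(1))

# Keep only the rows that contain at least one gene of interest
df_hits <- df[hits, ]
df_hits$col1
# [1] 1 3
```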
Edit: Adjusted script according to clearer description.