R regular expression question: I have a data.frame of job titles and job descriptions, and I need to
1) check whether a job description contains an email address (ending in .org, .edu, .gov, or .com), and
2) extract the email address and the 5 words that precede it.
The dataset can contain web URLs, which can also end in .edu, .com, etc., and it also contains line breaks. Basically, I was hoping to identify an email address as anything of the form [letters/numbers]@[letters/numbers] followed by .org, .edu, .gov, .com, or whatever else an email address can end in.
Here is a sample dataset:
    teststr = data.frame(job_title = c(1:8),
                 job_description = c('please send your resumes to adsf@dsf.com apply now!',
                                   'asdfa@asdf.com/adsf asdf',
                                   'visit us at sfds@adfa',
                                   'apply now',
                                   'follow us on @asdf.gov',
                                   'asdfa.gov',
                                   '.com',
                                   ''))
    > teststr
      job_title                                     job_description
    1         1 please send your resumes to adsf@dsf.com apply now!
    2         2                            asdfa@asdf.com/adsf asdf
    3         3                               visit us at sfds@adfa
    4         4                                           apply now
    5         5                              follow us on @asdf.gov
    6         6                                           asdfa.gov
    7         7                                                .com
    8         8
I attempted (1), but got the wrong answer:
    grepl('(*@.+\\.com)|(*@\\S\\.gov)', teststr$job_description)
The correct result for (1) should be:
      TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
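For reference, the kind of pattern described above ([letters/numbers]@[letters/numbers] plus a known ending) could be sketched roughly as follows in base R. This is only a sketch under some assumptions: the names `tlds`, `email_pattern`, `context_pattern`, `has_email`, `email`, and `preceding_words` are made up for illustration, the list of endings is limited to the four mentioned, a "word" is taken to mean any run of non-whitespace characters, and only the first email address in each description is considered.

```r
# Job descriptions from the sample dataset in the question.
job_description <- c(
  "please send your resumes to adsf@dsf.com apply now!",
  "asdfa@asdf.com/adsf asdf",
  "visit us at sfds@adfa",
  "apply now",
  "follow us on @asdf.gov",
  "asdfa.gov",
  ".com",
  ""
)

# Assumption: only these endings count; extend the vector as needed.
tlds <- c("com", "org", "edu", "gov")

# At least one character before "@", a domain, a dot, one of the endings,
# and then a word boundary (so "asdf.com/adsf" still counts).
email_pattern <- paste0(
  "[[:alnum:]._-]+@[[:alnum:].-]+\\.(?:", paste(tlds, collapse = "|"), ")\\b"
)

# (1) Does the description contain an email address?
has_email <- grepl(email_pattern, job_description, perl = TRUE)

# (2) Up to five whitespace-separated words followed by the email address.
# \\s also matches line breaks, so words split across lines are still counted.
context_pattern <- paste0("(?:\\S+\\s+){0,5}?", email_pattern)
hit <- regexpr(context_pattern, job_description, perl = TRUE)

# regmatches() drops non-matching elements, so fill a vector of NAs instead.
context <- rep(NA_character_, length(job_description))
context[hit != -1] <- regmatches(job_description, hit)

# The email is the last token of the match; the rest are the preceding words.
email <- sub("(?s)^.*\\s", "", context, perl = TRUE)
preceding_words <- trimws(sub("\\S+$", "", context, perl = TRUE))

result <- data.frame(has_email, email, preceding_words,
                     stringsAsFactors = FALSE)
print(result)
```

Rows without a match keep `NA` in `email` and `preceding_words`, and a description that starts with the address (such as the second one) gets an empty string for the preceding words. Requiring at least one character before the `@` is what keeps `@asdf.gov` from counting, and requiring the dot plus a listed ending is what rules out `sfds@adfa`.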
 
     
     
    