I would like to capture the characters between the 1st and 2nd occurrences of '_' in this string:
C2_Sperd20A_XXX_20170301_20170331
That is:
Sperd20A
Thank you
We can use sub. The pattern matches zero or more characters that are not a _ ([^_]*) from the start (^) of the string, followed by a _, then captures one or more characters that are not a _ as a group (([^_]+)), followed by a _ and any remaining characters (.*). The whole match is then replaced with the backreference (\\1) to the captured group
sub("^[^_]*_([^_]+)_.*", "\\1", str1)
#[1] "Sperd20A"
Or between the 2nd and 3rd _
sub("^([^_]*_){2}([^_]+).*", "\\2", str1)
#[1] "XXX"
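One thing to keep in mind with the sub approach: when the pattern does not match, sub returns the input unchanged rather than a missing value. A minimal sketch of a guard is below; the function name extract_between and the second test string are hypothetical and only for illustration.

```r
# Hypothetical helper: return the text between the 1st and 2nd "_",
# or NA when the string does not contain two underscores.
extract_between <- function(x) {
  pattern <- "^[^_]*_([^_]+)_.*"
  # grepl() tells us whether the pattern matched; without this check,
  # sub() would silently hand back the original string.
  ifelse(grepl(pattern, x), sub(pattern, "\\1", x), NA_character_)
}

res <- extract_between(c("C2_Sperd20A_XXX_20170301_20170331", "no-underscores"))
res
#[1] "Sperd20A" NA
```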
Or another option is strsplit
strsplit(str1, "_")[[1]][2]
#[1] "Sperd20A"
If it is between the 2nd and 3rd _
strsplit(str1, "_")[[1]][3]
#[1] "XXX"
### data used in the examples above
str1 <- "C2_Sperd20A_XXX_20170301_20170331"
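The strsplit idea can be wrapped so that it works on a whole vector of strings. This is only a sketch: the name nth_field and the short second input are hypothetical, and note that it returns the n-th "_"-separated field, so a string with a single _ still yields the text after it even though there is no closing _.

```r
# Hypothetical helper: n-th "_"-delimited field of each element of x.
nth_field <- function(x, n, sep = "_") {
  # fixed = TRUE treats sep as a literal string, not a regex
  parts <- strsplit(x, sep, fixed = TRUE)
  # NA when a string has fewer than n fields
  vapply(parts,
         function(p) if (length(p) >= n) p[n] else NA_character_,
         character(1))
}

strs <- c("C2_Sperd20A_XXX_20170301_20170331", "A_B")
nth_field(strs, 2)
#[1] "Sperd20A" "B"
nth_field(strs, 3)
#[1] "XXX" NA
```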
A good option is to use the stringr package:
library(stringr)
s <- "C2_Sperd20A_XXX_20170301_20170331"
# (?<=foo) Lookbehind
# (?=foo) Lookahead
str_extract(string = s, pattern = "(?<=_)(.*?)(?=_)")
#[1] "Sperd20A"
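The same lookbehind/lookahead idea should also work in base R by passing perl = TRUE, which avoids the extra package. The sketch below is a variation, not the pattern above: it uses [^_]+ instead of .*? so empty matches are skipped, and the variable names are only illustrative.

```r
s <- "C2_Sperd20A_XXX_20170301_20170331"
pattern <- "(?<=_)[^_]+(?=_)"

# regexpr() finds the first match; regmatches() extracts it
first <- regmatches(s, regexpr(pattern, s, perl = TRUE))
first
#[1] "Sperd20A"

# gregexpr() finds every run of characters enclosed by underscores
all_between <- regmatches(s, gregexpr(pattern, s, perl = TRUE))[[1]]
all_between
#[1] "Sperd20A" "XXX"      "20170301"
```

Because the lookahead does not consume the trailing _, that same _ can serve as the lookbehind for the next match, which is why consecutive fields are all found. The last field is not returned since no _ follows it.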