I have a data frame with a column of ordered (latitude, longitude) pairs stored as factors. I would like to extract the latitude and longitude values into two separate numeric columns. How can I remove the commas and parentheses and put the values into their own columns as numbers?
As was mentioned in your previous question, please make this question reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically after `set.seed(1)`), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Dec 14 '20 at 21:22
1 Answer
There are several ways to do this, but I'll focus on `strcapture`. My sample data:
somecoords <- c("(1.1,2.2)","(3.3,4.4)")
# if your data is a factor rather than 'character', convert it first:
somecoords <- as.character(somecoords)
`strcapture` starts with a vector of strings and returns a `data.frame`:
strcapture("\\D*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)\\D?.*$",
           somecoords, proto = list(num1=0, num2=0))
#   num1 num2
# 1  1.1  2.2
# 2  3.3  4.4
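Since the question is about a factor column inside a data frame, here is a minimal sketch of how the same call might be applied there. The data frame `df` and the column names `id`, `coords`, `latitude`, and `longitude` are hypothetical, made up for illustration; substitute your own.

```r
# Hypothetical data frame with a factor column of "(lat,lon)" pairs
df <- data.frame(id = 1:2,
                 coords = factor(c("(1.1,2.2)", "(3.3,4.4)")))

# Convert the factor to character, then capture both numbers;
# the 'proto' names become the names of the new columns
parts <- strcapture("\\D*(-?[0-9]+\\.?[0-9]*),(-?[0-9]+\\.?[0-9]*)\\D?.*$",
                    as.character(df$coords),
                    proto = list(latitude = 0, longitude = 0))

# Bind the new numeric columns onto the original data frame
df <- cbind(df, parts)
str(df)
```

Because the `proto` entries are numeric (`0`), the captured strings are converted to numeric columns, so no separate `as.numeric` step should be needed.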
Regex walk-through:
- `\\D*`: zero or more non-digit characters
- `(...)`: a capture group, saved by `strcapture` into a column
- `-?`: a literal dash/hyphen, optional
- `[0-9]+`: one or more digits
- `\\.?`: literal dot, optional, in case there are whole-number coordinates in your data
- `[0-9]*`: zero or more digits
- `,`: literal comma
- `\\D?.*`: optional non-digit character, followed by zero or more of anything
- `$`: end of string (perhaps not required, since `.*` should have expanded fully)
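As one of the other possible ways, and not part of the approach above, here is a rough sketch that avoids capture groups: strip the parentheses with `gsub` and let `read.csv` parse what is left. The sample values and the column names `latitude` and `longitude` are assumptions for illustration, and it presumes every string holds exactly two comma-separated numbers.

```r
# Sample data, including a negative and a whole-number coordinate
somecoords <- c("(1.1,2.2)", "(-3.3,4)")

# Remove the literal parentheses, leaving "lat,lon" on each line
stripped <- gsub("[()]", "", somecoords)

# Parse the comma-separated text as a two-column table
coords <- read.csv(text = stripped, header = FALSE,
                   col.names = c("latitude", "longitude"))
str(coords)
```

This hands the number parsing to `read.csv`, which may be simpler to maintain than a long regex, at the cost of being less strict about what the input looks like.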
 
    
    
r2evans
