I am trying to split a string in JavaScript on whitespace while ignoring whitespace enclosed in quotes. I found this regular expression: /\w+|"[^"]+"/g. The problem is that it doesn't work with accented characters such as á. How should I improve the regular expression to make it work?
        m3div0
        
- Can the string include quotes nested within quotes? If so, regex may not be the way to go. See this previous answer: http://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns – Tim Goodman Sep 23 '12 at 14:26
- No, the quotes are only used to mark words that shouldn't be split; the problem is only with accented characters. – m3div0 Sep 23 '12 at 14:29
- @david, are you using `split` or `exec`? If you're using the former, that regular expression is not what you want, and you should use the latter instead. – Alexander Sep 23 '12 at 14:38
3 Answers
That's because \w only matches [A-Za-z0-9_]. To match accented characters, add the range \x81-\xFF, which includes Latin-1 characters such as à and ã (along with some control characters and symbols, since the accented letters themselves sit in \xC0-\xFF):
(/[\w\x81-\xFF]+|"[^"]+"/g)
There's also this site, which is very helpful for building the required Unicode range.
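To see the difference, here is a minimal sketch that runs the question's pattern and the extended one against the same input. The sample string and variable names are made up for illustration, and the input is normalized to NFC on the assumption that the accented letters should be single precomposed code points, since the range cannot match combining marks.

```javascript
// Illustrative input (not from the question); NFC keeps é, è, û, ñ as
// single Latin-1 code points so the \x81-\xFF range can match them.
const input = 'café "crème brûlée" señor'.normalize('NFC');

// Pattern from the question: \w stops at every accented letter.
const original = input.match(/\w+|"[^"]+"/g);

// Pattern from this answer: \w plus the Latin-1 range.
const extended = input.match(/[\w\x81-\xFF]+|"[^"]+"/g);

console.log(original); // [ 'caf', '"crème brûlée"', 'se', 'or' ]
console.log(extended); // [ 'café', '"crème brûlée"', 'señor' ]
```

Letters outside Latin-1, such as ő (U+0151), are still not covered by this range.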
 
    
    
        João Silva
        
This matches either a run of characters that are neither whitespace nor quotes, or the text between a pair of quotes (quotes included):
/[^\s"]+|"[^"]+"/g
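As a rough sketch of how this pattern might be used to do the actual split, the helper below collects the matches and strips the surrounding quotes. The function name `tokenize` and the quote-stripping step are assumptions added for illustration; the answer itself only supplies the pattern.

```javascript
// Hypothetical helper: split on whitespace, keeping quoted phrases together.
function tokenize(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = text.match(/[^\s"]+|"[^"]+"/g) || [];
  // Assumed post-processing: drop the leading and trailing quote of a quoted token.
  return matches.map((token) => token.replace(/^"|"$/g, ''));
}

console.log(tokenize('één "twee drie" vier')); // [ 'één', 'twee drie', 'vier' ]
```

Because the first alternative excludes only whitespace and quotes, accented characters need no special handling here.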
 
    
    
        Tim Goodman
        
        If you want to match all non-whitespace characters instead of only alphanumeric ones, replace \w with \S.
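One thing to watch with a literal swap: \S also matches the quote character, and the regex engine tries alternatives left to right. The sketch below, with a made-up input string, shows what the swapped pattern returns and one possible adjustment (putting the quoted alternative first), which is an assumption of this example rather than part of the answer.

```javascript
// Illustrative input, not from the question.
const input = 'say "foo bar" á';

// \w replaced by \S as-is: \S+ wins at the opening quote and splits the phrase.
const nonSpaceFirst = input.match(/\S+|"[^"]+"/g);

// Assumed variant: try the quoted alternative before \S+.
const quotedFirst = input.match(/"[^"]+"|\S+/g);

console.log(nonSpaceFirst); // [ 'say', '"foo', 'bar"', 'á' ]
console.log(quotedFirst);   // [ 'say', '"foo bar"', 'á' ]
```

The reordered variant still splits a phrase whose opening quote sits in the middle of a token, such as `foo"bar baz"`, because \S+ consumes the quote there.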
 
    
    
        Bergi
        
- If the string contains `"foo bar"`, this will separately match `"foo` and `bar"`, whereas I think he'd want to match `"foo bar"`. I used `[^\s"]` in my answer to avoid this. – Tim Goodman Sep 23 '12 at 14:49