One of the most useful techniques in text matching is the regular expression (regex).
Questions tagged [textmatching]
78 questions
                    
62 votes, 19 answers
            
        Is regular expression recognition of an email address hard?
I recently read somewhere that writing a regexp to match an email address, taking into account all the variations and possibilities of the standard, is extremely hard and significantly more complicated than one would initially assume.
Why is…
         
    
    
        shoosh
        
- 76,898
- 55
- 205
- 325
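
As a rough illustration of why this question keeps coming up, here is a minimal Python sketch. The pattern `SIMPLE_EMAIL`, the helper `is_simple_email`, and the sample addresses are all assumptions made up for this example, not something from the question. The pattern is the kind of "obvious" regex people tend to write first; it rejects some addresses whose syntax RFC 5322 allows (a quoted local part, a domain literal) and accepts one the grammar forbids (consecutive dots in an unquoted local part).

```python
import re

# A deliberately naive pattern (an assumption for illustration only):
# "something@something.tld" with a limited character set.
SIMPLE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def is_simple_email(address: str) -> bool:
    """Return True if the whole string matches the naive pattern."""
    return SIMPLE_EMAIL.fullmatch(address) is not None


samples = [
    "user@example.com",         # ordinary address
    '"john doe"@example.com',   # quoted local part, allowed by RFC 5322
    "user@[192.168.0.1]",       # domain literal, allowed by RFC 5322
    "a..b@example.com",         # consecutive dots, not a valid dot-atom
]

for address in samples:
    print(f"{address!r}: {is_simple_email(address)}")
```

For most applications a loose check like this, followed by sending a confirmation mail, is probably more practical than trying to encode the full grammar in one expression.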
16 votes, 4 answers
            
        How to do `cy.notContains(text)` in cypress?
I can check whether text exists in Cypress with cy.contains('hello'). After I delete hello from the page, I want to check that hello no longer exists. How do I do something like cy.notContains('hello')?
         
    
    
        Alien
        
- 944
- 2
- 8
- 22
14 votes, 2 answers
            
        How to do Java String matching using Boolean Search Syntax?
I'm looking for a Java/Scala library that can take a user query and a text and return whether there was a match.
I'm processing a stream of information (e.g. a Twitter stream) and can't afford to use a batch process; I need to evaluate each…
         
    
    
        arjones
        
- 460
- 3
- 12
13 votes, 4 answers
            
        Search with various combinations of space, hyphen, casing and punctuations
My schema:
    
    
        Sudheer Aedama
        
- 2,116
- 2
- 21
- 39
11 votes, 1 answer
            
        Postgresql - converting text to ts_vector
Sorry for the basic question.
I have a table with the following columns.
      Column |  Type   | Modifiers 
     --------+---------+-----------
      id     | integer | 
      doc_id | bigint  | 
      text   | text    | 
I am trying to do text…
         
    
    
        CISCO
        
- 539
- 1
- 4
- 14
5 votes, 5 answers
            
        Python dictionary replacement with space in key
I have a string and a dictionary, and I have to replace every occurrence of each dict key in that text.
text = 'I have a smartphone and a Smart TV'
dict = {
    'smartphone': 'toy',
    'smart tv': 'junk'
}
If there is no space in keys, I will break the…
         
    
    
        James
        
- 13,571
- 6
- 61
- 83
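
One possible approach to the dictionary replacement question above, sketched in Python. It assumes that the keys are stored in lowercase and that matching should be case-insensitive (the sample text has `Smart TV` while the key is `smart tv`); the helper name `replace_keys` is hypothetical. The idea is to build a single alternation from the escaped keys, so keys containing spaces need no special handling.

```python
import re


def replace_keys(text: str, replacements: dict[str, str]) -> str:
    """Replace every whole-word occurrence of a key with its value.

    Assumes all keys are lowercase; matching ignores case.
    """
    # Try longer keys first so a long phrase wins over a shorter key
    # that happens to be its prefix.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )
    # Look up the matched text in lowercase to find the replacement.
    return pattern.sub(lambda m: replacements[m.group(0).lower()], text)


text = "I have a smartphone and a Smart TV"
replacements = {"smartphone": "toy", "smart tv": "junk"}
print(replace_keys(text, replacements))  # I have a toy and a junk
```

Doing the substitution in one pass also avoids a replacement value being rewritten again by a later key, which can happen with chained `str.replace` calls.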
5 votes, 5 answers
            
        Data Comparison
We have a SQL Server table containing Company Name, Address, and Contact name (among others).
We regularly receive data files from outside sources that require us to match up against this table.  Unfortunately, the data is slightly different since…
         
    
    
        wcm
        
- 9,045
- 7
- 39
- 64
4 votes, 1 answer
            
        Cluster sequences of strings in R
I have the following data:
attributes <- c("apple-water-orange", "apple-water", "apple-orange", "coffee", "coffee-croissant", "green-red-yellow", "green-red-blue", "green-red","black-white","black-white-purple")
attributes 
           attributes 
1 …
         
    
    
        constiii
        
- 638
- 3
- 19
4 votes, 1 answer
            
        How do I group companies that have different names but are essentially the same?
I am doing competitor analysis using Open Government Data from the UK public sector, but there are some anomalies in my results. When I group the contracts by company name, there are a lot of issues, such as company names being misspelt or varying in…
         
    
    
        Tejasvi Gaurav
        
- 43
- 5
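
For the company-name grouping question above, a minimal Python sketch of one common starting point: normalise the names, then group those whose string similarity exceeds a threshold. This uses `difflib.SequenceMatcher` from the standard library; the helper names, the sample company names, and the `0.85` threshold are assumptions for illustration and would need tuning on real data. The greedy comparison against each group's first member is a simplification, not a proper clustering algorithm.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def group_names(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedily group names that look alike after normalisation."""
    groups: list[list[str]] = []  # first element acts as the representative
    for name in names:
        key = normalize(name)
        for group in groups:
            representative = normalize(group[0])
            if SequenceMatcher(None, key, representative).ratio() >= threshold:
                group.append(name)
                break
        else:
            # No existing group was similar enough; start a new one.
            groups.append([name])
    return groups


# Hypothetical sample data
names = ["Acme Ltd", "ACME Ltd.", "Acme Ldt", "Globex Corporation"]
for group in group_names(names):
    print(group)
```

This compares every name with every group, so it is only suitable for modest list sizes; larger datasets usually need some form of blocking first.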
3 votes, 7 answers
            
        How to match URIs in text?
How would one go about spotting URIs in a block of text?
The idea is to turn such runs of text into links. This is pretty simple to do if one only considers the http(s) and ftp(s) schemes; however, I am guessing the general problem (considering…
        Ufuk Kayserilioglu
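
A minimal Python sketch of the simple end of the URI question above. It only looks for the `scheme://...` shape, where the scheme follows the generic syntax of RFC 3986 (a letter followed by letters, digits, `+`, `.` or `-`). The pattern `URI_RE`, the helper `find_uris`, and the trailing-punctuation heuristic are assumptions for illustration; schemes without `//`, such as `mailto:`, are deliberately out of scope here.

```python
import re

# scheme "://" followed by anything that is not whitespace, an angle
# bracket, or a double quote. This is a heuristic, not a full URI parser.
URI_RE = re.compile(r"\b[A-Za-z][A-Za-z0-9+.\-]*://[^\s<>\"]+")


def find_uris(text: str) -> list[str]:
    """Return URI-like substrings, dropping trailing sentence punctuation."""
    return [m.group(0).rstrip(".,;:!?)") for m in URI_RE.finditer(text)]


sample = "Visit https://example.com/a?b=1, or ftp://host/file."
print(find_uris(sample))
```

Stripping trailing punctuation is a trade-off: it fixes the common "URL at the end of a sentence" case but will truncate a URI that legitimately ends in one of those characters.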
3 votes, 2 answers
            
        Is there a better way to capture all the regex patterns in matching with nested lists within a dictionary?
I am trying out a simple text-matching activity where I scrape the titles of blog posts and try to match them with my predefined categories once I find specific keywords.
So for example, the title of the blog post is 
"Capture Perfect Night Shots with…
         
    
    
        Nicoconut
        
- 33
- 4
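
One way the keyword-to-category matching above might be organised, as a Python sketch. The categories, keyword patterns, and sample titles below are hypothetical and not taken from the question. Each category's list of patterns is joined into one compiled regex up front, so every title is scanned once per category rather than once per keyword.

```python
import re

# Hypothetical categories, each mapped to a list of regex fragments.
CATEGORIES = {
    "photography": ["night shots?", "camera", "lens(?:es)?"],
    "travel": ["itinerary", "road trip"],
}

# Compile one case-insensitive alternation per category.
COMPILED = {
    category: re.compile(r"\b(?:" + "|".join(patterns) + r")\b", re.IGNORECASE)
    for category, patterns in CATEGORIES.items()
}


def categorize(title: str) -> list[str]:
    """Return every category whose pattern occurs in the title."""
    return [cat for cat, regex in COMPILED.items() if regex.search(title)]


print(categorize("Best camera for a road trip"))
print(categorize("Cooking tips"))
```

Because the fragments are inserted as raw regex, plain keywords containing special characters would need `re.escape` first.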
3 votes, 2 answers
            
        Android Espresso test always fails in text matching
I have a problem with an Espresso test: I don't know why matching the text always fails for me. I even tried creating a simple app that has two activities; the first activity has a TextView and two buttons, where one button shows a toast and the other goes to the next activity,…
         
    
    
        Rooh Al-mahaba
        
- 594
- 1
- 14
- 28
3 votes, 0 answers
            
        Record Linkage with multiple datasets
The problem
fastLink and RecordLinkage packages do extremely well in matching records (rows) from database A to database B and vice-versa. The developers are working on extending from matching only 2 databases to multiple databases.
A simple example…
         
    
    
        Yeshyyy
        
- 669
- 6
- 21
3 votes, 0 answers
            
        text matching, semantic similarity, match the similar phrase/ words python semantic wordNet FuzzyMatch
Using WordNet text matching, I realized that WordNet can only match a single word to a single word. It cannot match a single word to a phrase.
As you can see, I have two lists.
list1=['fruit', 'world']
list2=[u'domain', u'creation Year',…
         
    
    
        bob90937
        
- 553
- 1
- 5
- 18
3 votes, 1 answer
            
        How can I match string order between two documents in Perl?
I have a problem writing a Perl program to match the words in two documents. Let's say there are documents A and B.
I want to delete the words in document A that are not in document B.
Example 1:
A: I eat pizza
B: She go to the market and…
         
    
    
        Randy
        
- 33
- 6