I need to remove the duplicates in a string and return the unique values using a regex pattern.
Value = "['234.78','234.78']"
Expected result:
Value = "['234.78']"
Any help would be appreciated!
 
    
     
    
It looks like you're working with a string that holds a Python literal. If so, you'd be better off parsing it with ast.literal_eval() than using regular expressions. Once it is parsed, you can convert list -> set -> list to remove duplicates, then call repr() to get the string representation back in the form you describe:
import ast
value = "['234.78','234.78']"
value = list(set(ast.literal_eval(value)))
value = repr(value) # "['234.78']"
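One caveat with the list -> set -> list step: a set does not guarantee element order, so with several distinct values the result may come back reordered. Below is a minimal sketch of an order-preserving variant, assuming Python 3.7+ (where dict keeps insertion order). The helper name dedupe_preserving_order is hypothetical and not part of the original answer.

```python
import ast


def dedupe_preserving_order(serialized: str) -> str:
    """Parse a serialized list, drop duplicates, keep first-seen order."""
    items = ast.literal_eval(serialized)
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return repr(list(dict.fromkeys(items)))


print(dedupe_preserving_order("['234.78','1.5','234.78']"))  # ['234.78', '1.5']
```

For the single-value input in the question, this returns the same string as the set-based version.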
 
    
Since the data is almost valid JSON, you could replace the single quotes with double quotes and parse it with json.loads():
import json
def dedupe_serialized_list(serialized_list: str):
    """
    Dedupe a serialized list of str values.
    :param str serialized_list: A serialized list of str values
    :return: a deduped list (re-serialized)
    :rtype: str
    """
    return str(list(set(json.loads(serialized_list.replace("'", '"')))))
if __name__ == '__main__':
    print(dedupe_serialized_list("['234.78','234.78']")) # ['234.78']
As a lambda:
dedupe = lambda value: str(list(set(json.loads(value.replace("'", '"')))))
 
    
Esqew's answer is a sensible approach.
If you really need to do it with a regex, the code below works for this input:
import re
Value = "['234.78','234.78']"
Value = re.sub(r"('\d+\.\d+'),\1", r'\1', Value)
Value  # "['234.78']"
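The pattern matches a quote, one or more digits, a decimal point, one or more digits and a closing quote, then a comma followed by the same quoted number again (the \1 backreference). Note that it only collapses a pair of adjacent duplicates with no whitespace after the comma.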
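If the same value can appear more than twice in a row, or with a space after the comma, the pattern can be loosened slightly. The following is a sketch under those assumptions, not a general solution: it still handles only adjacent duplicates of decimal numbers in single quotes. The function name collapse_adjacent_duplicates is made up for illustration.

```python
import re

# A quoted decimal, followed by one or more ",<optional whitespace><same quoted decimal>"
ADJACENT_DUPLICATES = re.compile(r"('\d+\.\d+')(?:,\s*\1)+")


def collapse_adjacent_duplicates(value: str) -> str:
    """Replace each run of identical adjacent quoted numbers with one copy."""
    return ADJACENT_DUPLICATES.sub(r"\1", value)


print(collapse_adjacent_duplicates("['234.78','234.78','234.78']"))  # ['234.78']
```

Duplicates separated by a different value (for example "['1.5','2.0','1.5']") are left untouched, which is one more reason to prefer parsing the string over a regex.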
