I have this regex:
cont_we_re = r"((?!\S+\s?(?:(cbf|cd3|cbm|m3|m[\\\>\?et]?|f3|ft3)))(?:([0-9,\.]+){2,})(?:\s*(?:(lb|kg)\.?s?))?)"
The current logic is: match any numeric chunk, optionally followed by a kg/kgs or lb/lbs unit, but skip the numeric chunk when cbf, cd3, cbm, m3, etc. come after it. It works for these sample cases:
s1 = "18300 kg 40344.6 lbs 25000 m3"
s2 = "18300kg 40344.6lbs 25000m3"
s3 = "18300 kg   KO"
s4 = "40344.6 lb5   "
s5 = "40344.6  "
I'm using re.finditer() with the re.IGNORECASE flag, like this:
for s in [s1, s2, s3, s4, s5]:
    all_val = [i.group().strip() for i in re.finditer(cont_we_re, s, re.IGNORECASE)]
    print(all_val)
This gives me the following output:
['18300 kg', '40344.6 lbs']
['18300kg', '40344.6lbs']
['18300 kg']
['40344.6 lb']
['40344.6']
Now I'm trying to implement another rule: if a numeric chunk followed by lbs is found, give it first priority and return only that match. If there is no lbs match, fall back to the numeric chunks followed by kgs, and otherwise to the bare numeric chunks.
I've done this without changing the regex, like this:
for s in [s1, s2, s3, s4, s5]:
    all_val = [i.group().strip() for i in re.finditer(cont_we_re, s, re.IGNORECASE)]
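    # match the unit case-insensitively, consistent with the main regex
    kg_val = [i for i in all_val if re.findall(r"kg\.?s?", i, re.IGNORECASE)]
    lb_val = [i for i in all_val if re.findall(r"lb\.?s?", i, re.IGNORECASE)]
    final_val = lb_val if lb_val else (kg_val if kg_val else list(set(all_val) - (set(kg_val+lb_val))))
    print(final_val)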
This gives me the desired output:
['40344.6 lbs']
['40344.6lbs']
['18300 kg']
['40344.6 lb']
['40344.6']
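A slightly tidier variant of the same post-filter, as a minimal sketch: it reads the unit from the capture group that cont_we_re already contains instead of running a second regex over every match. It assumes the (lb|kg) group is group 4 of the pattern exactly as written above, and pick_weights is only an illustrative helper name. It is still a post-processing step, not a single-regex solution.

```python
import re

cont_we_re = r"((?!\S+\s?(?:(cbf|cd3|cbm|m3|m[\\\>\?et]?|f3|ft3)))(?:([0-9,\.]+){2,})(?:\s*(?:(lb|kg)\.?s?))?)"

def pick_weights(s):
    """Return lb matches if any, else kg matches, else bare numeric chunks."""
    # Bucket every match by its captured unit (None when no unit followed).
    by_unit = {"lb": [], "kg": [], None: []}
    for m in re.finditer(cont_we_re, s, re.IGNORECASE):
        unit = m.group(4)  # assumption: 4th capture group is (lb|kg)
        key = unit.lower() if unit else None
        by_unit[key].append(m.group().strip())
    # Priority order: lbs first, then kgs, then unit-less numbers.
    return by_unit["lb"] or by_unit["kg"] or by_unit[None]

for s in ["18300 kg 40344.6 lbs 25000 m3", "18300 kg   KO", "40344.6  "]:
    print(pick_weights(s))
```

The bucketing keeps the priority rule in one place, so adding another unit would only mean another key and another `or` term.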
The question is: how can I apply this same logic inside the regex itself, without separately searching for kgs and lbs in every match that cont_we_re produces for each string? I tried an "IF-THEN-ELSE" style regex as described in this question, but it doesn't work: the opening (? of the conditional apparently raises a pattern error in Python. Is there any way to do this with the cont_we_re regex alone?
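For reference, a minimal sketch of the kind of failure meant here. The two patterns below are hypothetical illustrations, not the exact ones I tried: a PCRE-style conditional whose condition is a lookahead, and a conditional that tests a numbered group, which is the form Python's re module documents as (?(id/name)yes-pattern|no-pattern).

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Hypothetical PCRE-style conditional with a lookahead as the condition.
lookahead_conditional = r"(?(?=\d)\d+|\w+)"

# Conditional on whether group 1 took part in the match.
group_conditional = r"(<)?\w+(?(1)>)"

print(compiles(lookahead_conditional))
print(compiles(group_conditional))
```

If the first pattern is rejected and the second accepted, the limitation is in the lookahead condition rather than in conditionals as such.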