I am trying to write a program that finds the genes in a genome sequence.
Context:
Biologists use a sequence of letters A, C, T and G to model a genome.
A gene is a substring of a genome that starts after a triplet ATG and ends before a triplet TAG, TAA, or TGA.
Furthermore, the length of a gene string is a multiple of 3, and the gene does not contain any of the triplets ATG, TAG, TAA, or TGA.
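To make these rules concrete, here is a minimal sketch of a check for a single candidate string. The name `is_valid_gene` is hypothetical, and the sketch rests on two assumptions that the problem statement does not spell out: the forbidden triplets are read in frame (at offsets 0, 3, 6, ... from the start of the gene), and an empty string does not count as a gene.

```python
# Triplets that may not appear inside a gene (the start codon plus the three stop codons).
FORBIDDEN_TRIPLETS = {"ATG", "TAG", "TAA", "TGA"}


def is_valid_gene(candidate):
    """Return True if candidate satisfies the gene rules described above.

    Assumptions (not stated in the problem): triplets are read in frame,
    and an empty string is not a gene.
    """
    if not candidate or len(candidate) % 3 != 0:
        return False
    # Split the candidate into consecutive, non-overlapping triplets.
    codons = [candidate[i:i + 3] for i in range(0, len(candidate), 3)]
    return all(codon not in FORBIDDEN_TRIPLETS for codon in codons)
```

With these assumptions, `is_valid_gene("GGGCGT")` is true, while `is_valid_gene("TTTT")` (length not a multiple of 3) and `is_valid_gene("TTTTAA")` (contains TAA) are false.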
My desired result is:
>>Enter a genome string:>>TTATGTTTTAAGGATGGGGCGTTAGTT
Output:
>>TTT
>>GGGCGT
>>Enter a genome string:>>TGTGTGTATAT
>>No gene is found
So far I have got:
import re

def findGene(gene):
    # Capture the triplets between a start codon (ATG) and the next in-frame stop codon.
    pattern = re.compile(r'ATG((?:[ACTG]{3})*?)(?:TAG|TAA|TGA)')
    return pattern.findall(gene)
    findGene('TTATGTTTTAAGGATGGGGCGTTAGTT')  # never runs: it comes after the return

def main():
    geneinput = input("Enter a genome string: ")
    print(findGene(geneinput))

main()
# Sample input: TTATGTTTTAAGGATGGGGCGTTAGTT
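This prints the matches as a Python list (for example ['TTT', 'GGGCGT']) instead of one gene per line, and it prints [] instead of "No gene is found" when nothing matches.
How can I make this code work properly?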
Thank you.
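One possible way to get the desired output is sketched below. It is not the only approach. The names `find_genes` and `report` are hypothetical, and the pattern assumes that the forbidden triplets are checked in frame and that an empty match (ATG immediately followed by a stop codon) is not a gene. A negative lookahead keeps ATG, TAG, TAA, and TGA out of the captured triplets, and `+` requires at least one triplet.

```python
import re

# ATG, then one or more in-frame triplets that are none of ATG/TAG/TAA/TGA,
# then a stop codon. Only the triplets in between are captured.
GENE_PATTERN = re.compile(
    r"ATG((?:(?!ATG|TAG|TAA|TGA)[ACGT]{3})+)(?:TAG|TAA|TGA)"
)


def find_genes(genome):
    """Return all non-overlapping genes found in genome, left to right."""
    return GENE_PATTERN.findall(genome.upper())


def report(genome):
    """Return the lines to print: one per gene, or a single 'not found' line."""
    genes = find_genes(genome)
    return genes if genes else ["No gene is found"]


def main():
    genome = input("Enter a genome string: ")
    for line in report(genome):
        print(line)
```

Calling `main()` prompts for a genome and prints each gene on its own line. For the two inputs above, `report` returns `['TTT', 'GGGCGT']` and `['No gene is found']` respectively.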
 
    