I have a large DAT file of this type:
//
AC  T00020
OS  rat, Rattus norvegicus
BS  R02959; HS$APOA1_02; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00024
OS  rat, Rattus norvegicus
BS  R00135; HS$APOA1_01; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
BS  R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS  R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
//
AC  T00027
OS  clawed frog, Xenopus
BS  R02120; AP1$CONS; Quality: 6.
//
I first want to split it into modules, where each module starts and ends with '//'. Then I want to keep only those modules that contain the line 'OS  human, Homo sapiens'.
I am writing a Python 3 script to do this, but I have not managed to split the file into modules yet.
Finally, I want to keep this part of the DAT file:
AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
BS  R02333; MOUSE$ALBU_12; Quality: 6; Alb, G000464; mouse, Mus musculus.
BS  R02334; MOUSE$ALBU_13; Quality: 6; Alb, G000464; mouse, Mus musculus.
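
To make the goal concrete, here is a rough sketch of the splitting and filtering described above. It is not a tested solution for the full file. The names `is_human_os`, `human_modules` and `SAMPLE` are made up for illustration, and it assumes that every separator sits alone on its own line and that the field code and its value are separated by whitespace. It checks only the OS line, because BS lines in non-human modules can also contain "human, Homo sapiens".

```python
SAMPLE = """//
AC  T00020
OS  rat, Rattus norvegicus
BS  R02959; HS$APOA1_02; Quality: 6; APOA1, G000203; human, Homo sapiens.
//
AC  T00025
OS  human, Homo sapiens
BS  R02119; ANF$CONS_01; Quality: 4.
//
"""


def is_human_os(line):
    """Return True only for an OS line whose value is 'human, Homo sapiens'."""
    fields = line.split(None, 1)  # split the field code from its value
    return (
        len(fields) == 2
        and fields[0] == "OS"
        and fields[1].strip() == "human, Homo sapiens"
    )


def human_modules(lines):
    """Yield each '//'-delimited module (as a list of lines) that has a human OS line."""
    module = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "//":
            # A separator closes the current module; keep it only if it is human.
            if any(is_human_os(item) for item in module):
                yield module
            module = []
        elif line.strip():
            module.append(line)
    # Cover a file that does not end with a trailing '//'.
    if any(is_human_os(item) for item in module):
        yield module


for module in human_modules(SAMPLE.splitlines()):
    print("\n".join(module))
```

For the real file, an open file handle could be passed to `human_modules` in place of `SAMPLE.splitlines()`, so the file is read line by line and does not have to fit in memory at once.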