I have a file `gff3.txt` containing data like this (billions of lines):
 scaffold1000|size145372 . gene 16987 23149 . - . ID=evm.TU.scaffold1000|size145372.2;Name=EVM%20prediction%20scaffold1000|size145372.2
 scaffold1000|size145372 . mRNA 16987 23149 . - . ID=evm.model.scaffold1000|size145372.2;Parent=evm.TU.scaffold1000|size145372.2;Name=EVM%20prediction%20scaffold1000|size145372.2
 scaffold1000|size145372 . exon 22965 23149 . - . ID=evm.model.scaffold1000|size145372.2.exon1;Parent=evm.model.scaffold1000|size145372.2
 scaffold9|size467357 . gene 373475 396789 . + . ID=evm.TU.scaffold9|size467357.56;Name=EVM%20prediction%20scaffold9|size467357.56
 scaffold9|size467357 . mRNA 373475 396789 . + . ID=evm.model.scaffold9|size467357.56;Parent=evm.TU.scaffold9|size467357.56;Name=EVM%20prediction%20scaffold9|size467357.56
 scaffold9|size467357 . exon 373475 373695 . + . ID=evm.model.scaffold9|size467357.56.exon1;Parent=evm.model.scaffold9|size467357.56
 ...
And another file, `position.txt` (billions of lines):
 scaffold1000|size145372.2  scaffold1000|size145372:16987-23149
 scaffold9|size467357.56    scaffold10008|size45161:373475-396789
 ...
I would like to obtain this:
 scaffold1000|size145372 . gene 16987 23149 . - . ID=evm.TU.scaffold1000|size145372:16987-23149;Name=EVM%20prediction%20scaffold1000|size145372:16987-23149
 scaffold1000|size145372 . mRNA 16987 23149 . - . ID=evm.model.scaffold1000|size145372:16987-23149;Parent=evm.TU.scaffold1000|size145372:16987-23149;Name=EVM%20prediction%20scaffold1000|size145372:16987-23149
 scaffold1000|size145372 . exon 22965 23149 . - . ID=evm.model.scaffold1000|size145372:16987-23149.exon1;Parent=evm.model.scaffold1000|size145372:16987-23149
 scaffold9|size467357 . gene 373475 396789 . + . ID=evm.TU.scaffold10008|size45161:373475-396789;Name=EVM%20prediction%20scaffold10008|size45161:373475-396789
 scaffold9|size467357 . mRNA 373475 396789 . + . ID=evm.model.scaffold10008|size45161:373475-396789;Parent=evm.TU.scaffold10008|size45161:373475-396789;Name=EVM%20prediction%20scaffold10008|size45161:373475-396789
 scaffold9|size467357 . exon 373475 373695 . + . ID=evm.model.scaffold10008|size45161:373475-396789.exon1;Parent=evm.model.scaffold10008|size45161:373475-396789
 ...
So I would like to find, in column 9 of `gff3.txt`, every pattern that matches column 1 of `position.txt`, and replace it with the corresponding value from column 2 of `position.txt`.
I tried with awk:
 awk '
     NR==FNR{a[$9]
     next
 }
 ($2 in a) {
     print
 }' gff3.txt position.txt > output.txt
But this didn't work. Maybe it is because the patterns in column 9 of `gff3.txt` are embedded in other text?
I also tried to adapt these threads to my data, without success: stackoverflow1, stackoverflow2, stackoverflow3, stackExchange...
Any advice on doing this in awk, sed, or another tool would be much appreciated.
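To make the intended substitution concrete, here is a minimal sketch of the behaviour I am after, run on a two-line sample. It rests on assumptions that may not hold for the real data: every ID to replace has the form `scaffoldN|sizeM.K` (inferred from the samples above), and all of `position.txt` fits in memory as an awk array, which is doubtful for billions of lines. The `sample_*` file names are hypothetical and only there so the real files are not touched.

```shell
# Tiny sample inputs (hypothetical file names, so the real files stay untouched).
cat > sample_position.txt <<'EOF'
scaffold1000|size145372.2  scaffold1000|size145372:16987-23149
scaffold9|size467357.56    scaffold10008|size45161:373475-396789
EOF

cat > sample_gff3.txt <<'EOF'
scaffold1000|size145372 . gene 16987 23149 . - . ID=evm.TU.scaffold1000|size145372.2;Name=EVM%20prediction%20scaffold1000|size145372.2
scaffold9|size467357 . exon 373475 373695 . + . ID=evm.model.scaffold9|size467357.56.exon1;Parent=evm.model.scaffold9|size467357.56
EOF

awk '
# First file (position table): remember old ID -> new ID.
NR == FNR { map[$1] = $2; next }

# Second file (GFF3): scan each line for tokens shaped like scaffoldN|sizeM.K
# (assumed ID shape) and swap each one for its mapped value, if it has one.
{
    rest = $0
    out  = ""
    while (match(rest, /scaffold[0-9]+[|]size[0-9]+[.][0-9]+/)) {
        key  = substr(rest, RSTART, RLENGTH)
        repl = (key in map) ? map[key] : key   # unknown IDs are left unchanged
        out  = out substr(rest, 1, RSTART - 1) repl
        rest = substr(rest, RSTART + RLENGTH)  # continue after the match
    }
    print out rest
}' sample_position.txt sample_gff3.txt > sample_output.txt
```

The position file is read first, so the lookup table exists before any GFF3 line is processed. Matching whole `scaffoldN|sizeM.K` tokens rather than doing a plain substring replacement keeps an ID ending in `.2` from also matching one ending in `.25`. Column 1 is never altered, because there the scaffold name is not followed by `.K`, and the original field separators are preserved since the line is never rebuilt from its fields.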
 
     
     
    