I wrote this to retrieve the FASTA sequences for a list of protein accession numbers (acc_list.txt, one per line) and write them to a text file (prot_list.txt).
from Bio import Entrez  # Biopython
x=0
# First pass: count the lines so that progress can be reported.
with open("acc_list.txt","r") as input:
    number = sum(1 for items in input) ###
# Second pass: fetch each record and write it to the output file.
with open("acc_list.txt","r") as input:
    with open ("prot_list.txt","w") as output:
        for acc in input:
            handle = Entrez.efetch(db="protein",id=acc,rettype="fasta")
            x+=1
            print("Dealing with", str(acc.strip()), str(x), "out of", str(number), sep=" ")
            output.write(handle.read())
It is a big list, so the penultimate line gives me an idea of the progress.
As you can see, number = sum(1 for items in input) gives the total number of lines, but I have to open and close the file separately for it, because if I put that line under the latter with statement, i.e.
x=0
with open("acc_list.txt","r") as input:
    with open ("prot_list.txt","w") as output:
        for acc in input:
            number = sum(1 for items in input) ###
            handle = Entrez.efetch(db="protein",id=acc,rettype="fasta")
            x+=1
            print("Dealing with", str(acc.strip()), str(x), "out of", str(number), sep=" ")
            output.write(handle.read())
it stops after counting the items and gives no other output.
I'm guessing this is because number = sum(1 for items in input) itself iterates through the file to the end, so the for loop has nothing left to read.
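A minimal sketch of that guess, using an in-memory io.StringIO object as a stand-in for acc_list.txt (the accession values are made up for illustration). It suggests that a file object can only be iterated once per pass, and that seek(0) rewinds it for another pass:

```python
import io

# Stand-in for acc_list.txt; these accession values are hypothetical.
fake_file = io.StringIO("ACC001\nACC002\nACC003\n")

first = next(fake_file)                # what the for loop reads on its first pass
remaining = sum(1 for _ in fake_file)  # counting consumes every remaining line
leftover = list(fake_file)             # nothing is left for the loop to read

fake_file.seek(0)                      # rewind to the beginning
after_rewind = sum(1 for _ in fake_file)

print(repr(first), remaining, leftover, after_rewind)
```

So counting inside the loop both under-counts (the first line has already been consumed) and ends the loop after one pass.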
Is there a more efficient way to obtain the number of lines in a file? I can imagine my approach causing problems with an even bigger list. The older answers I've seen all involve iterating through the file first.
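For reference, here is a hedged sketch of two alternatives that still read the file once, since the number of lines cannot be known without reading it. The helper names read_accessions and count_lines, the chunk size, and the demo accession values are assumptions for illustration, and the Entrez call is left out so the example runs offline:

```python
import os
import tempfile

def read_accessions(path):
    """Read the file once and return its non-empty lines, stripped."""
    with open(path, "r") as infile:
        return [line.strip() for line in infile if line.strip()]

def count_lines(path, chunk_size=1024 * 1024):
    """Count newline characters by reading fixed-size binary chunks.

    A final line without a trailing newline is not counted.
    """
    total = 0
    with open(path, "rb") as infile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total

# Demo on a temporary file with made-up accession values.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "acc_list.txt")
    with open(demo_path, "w") as demo:
        demo.write("ACC001\nACC002\nACC003\n")

    accessions = read_accessions(demo_path)
    number = len(accessions)  # total is known without a second pass
    progress = [
        f"Dealing with {acc} {x} out of {number}"
        for x, acc in enumerate(accessions, start=1)
    ]
    newline_count = count_lines(demo_path)

print("\n".join(progress))
print(newline_count)
```

Holding the accessions in a list costs memory proportional to the file size, which is usually small for a list of identifiers. The chunked count avoids that cost when only the total is needed.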