I am trying to get the paths of a group of files whose names I have in a list. The files are in different subfolders. I am using os.walk and nested loops to go through the files, appending each complete path to a new list (fasta_paths) for use in a different program. But there is an error in the code: only the first cycle of the outer loop produces a result.
The code is based on this thread: Need the path for particular files using os.walk()
I am using Python 3.6 on macOS 10.14.6.
I am not sure if it matters, but the directories are on an external hard drive.
    import pandas as pd
    import os
    dir = "/Volumes/dir1/dir2"
    fastafiles = ["file1", "file2", "file3"]
    fastafiles_df = pd.DataFrame(fastafiles)
    fasta_paths = []
    for fasta in fastafiles_df[0]:
        #1
        for dir, subdirs, files in os.walk(dir):
            for file in files:
                if file.endswith(fasta):
                    #2
                    fasta_paths.append(os.path.join(dir, file))
                    #3
Running the code gives me one entry in fasta_paths, containing only the path of the first file.
If I print(fasta) at #1, I get all 3 file names from my dataframe.
If I print(file) at #2, I get only 1 file name,
and if I print(fasta_paths) at #3, I get only the path of the first file.
Could someone point out why the inner loops do not find the other files?
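One likely cause, going by the code as posted: the loop variable dir in `for dir, subdirs, files in os.walk(dir)` rebinds the same name that holds the start directory. After the first pass of the outer loop, dir refers to the last directory os.walk visited, so the next os.walk(dir) starts from there instead of from "/Volumes/dir1/dir2". Below is a minimal sketch of one way around it, assuming the goal is a flat list of matching paths. The helper name find_paths, the variable names, and the temporary demo tree are hypothetical and not part of the original code.

```python
import os
import tempfile


def find_paths(root, suffixes):
    """Walk root once and collect paths of files whose names end with any suffix."""
    matches = []
    suffixes = tuple(suffixes)  # str.endswith accepts a tuple of suffixes
    # The loop variables use names different from `root`, so the start
    # directory is never overwritten while walking.
    for current_dir, _subdirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffixes):
                matches.append(os.path.join(current_dir, name))
    return matches


# Hypothetical demo tree standing in for the real directories.
with tempfile.TemporaryDirectory() as root:
    layout = [("a", "file1"), ("b", "file2"), (os.path.join("b", "c"), "file3")]
    for sub, name in layout:
        os.makedirs(os.path.join(root, sub), exist_ok=True)
        open(os.path.join(root, sub, name), "w").close()

    found = find_paths(root, ["file1", "file2", "file3"])
    found_names = sorted(os.path.basename(p) for p in found)
    print(found_names)
```

Walking the tree once and testing each file name against all suffixes also avoids repeating the full walk for every entry in the list, which may matter on an external drive. Renaming the loop variable alone (for example to current_dir) should be enough to fix the posted code if the nested-loop structure is preferred.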
 
     
    