I have a function that reads in a file, compares each record in that file to the records in another file and, depending on a rule, appends the record to one of two lists.
I have an empty list for adding matched results to:
match = []
I have a list of restrictions (other_links in the code below) that I want to compare the records in a series of files against:
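For illustration, a simplified stand-in for that list (the real records are larger, but only the 'data' key matters for the comparison):
other_links = [{'data': 'abc123'}, {'data': 'def456'}]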
I have a function that reads in a file and checks whether it contains any matches; if a record matches, I append it to the match list:
def link_match(file):
    # 'file' is a path string from glob, so open it before parsing the JSON
    with open(file) as f:
        links = json.load(f)
    for link in links:
        found = False
        for other_link in other_links:
            if link['data'] == other_link['data']:
                match.append(link)
                found = True
        if not found:
            print("not found")
I have numerous files to compare, so I wish to use the multiprocessing library.
I create a list of file names to act as function arguments:
list_files = []
for file in glob.glob("/path/*.json"):
    list_files.append(file)
I then use the pool's map method to call the function with the different input files:
if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=6)
    pool.map(link_match, list_files)
    pool.close()
    pool.join()
CPU use goes through the roof and, by adding a print line inside the function's loop, I can see that matches are being found and the function is behaving correctly.
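The temporary print was roughly this, inside the matching branch (the exact message is unimportant):
            if link['data'] == other_link['data']:
                match.append(link)
                found = True
                print("matched:", link['data'])  # temporary diagnostic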
However, the match results list remains empty. What am I doing wrong?