How does the flow of apply_async work between calling the iterable (?) function and the callback function?
Setup: I am reading some lines of all the files inside a 2000 file directory, some with millions of lines, some with only a few. Some header/formatting/date data is extracted to charecterize each file. This is done on a 16 CPU machine, so it made sense to multiprocess it.
Currently, the expected result is being sent to a list (ahlala) so I can print it out; later, this will be written to *.csv. This is a simplified version of my code, originally based off this extremely helpful post.
import multiprocessing as mp
def dirwalker(directory):
  ahlala = []
  # X() reads files and grabs lines, calls helper function to calculate
  # info, and returns stuff to the callback function
  def X(f): 
    fileinfo = Z(arr_of_lines) 
    return fileinfo 
  # Y() reads other types of files and does the same thing
  def Y(f): 
    fileinfo = Z(arr_of_lines)
    return fileinfo
  # results() is the callback function
  def results(r):
    ahlala.extend(r) # or .append, haven't yet decided
  # helper function
  def Z(arr):
    return fileinfo # to X() or Y()!
  for _,_,files in os.walk(directory):
    pool = mp.Pool(mp.cpu_count()
    for f in files:
      if (filetype(f) == filetypeX): 
        pool.apply_async(X, args=(f,), callback=results)
      elif (filetype(f) == filetypeY): 
        pool.apply_async(Y, args=(f,), callback=results)
  pool.close(); pool.join()
  return ahlala
Note, the code works if I put all of Z(), the helper function, into either X(), Y(), or results(), but is this either repetitive or possibly slower than possible? I know that the callback function is called for every function call, but when is the callback function called?  Is it after pool.apply_async()...finishes all the jobs for the processes? Shouldn't it be faster if these helper functions were called within the scope (?) of the first function pool.apply_async() takes (in this case, X())? If not, should I just put the helper function in results()?
Other related ideas: Are daemon processes why nothing shows up? I am also very confused about how to queue things, and if this is the problem. This seems like a place to start learning it, but can queuing be safely ignored when using apply_async, or only at a noticable time inefficiency?