Is there a way to share a huge dictionary with multiprocessing subprocesses on Windows without duplicating it in memory? I only need read-only access within the subprocesses, if that helps.
My program roughly looks like this:
import multiprocessing

def workerFunc(args):
    id, data_mp, some_more_args = args
    # Do some logic
    # Parse some files on the disk
    # and access some random keys from data_mp which are only known after parsing those files on disk ...
    some_keys = [some_random_ids...]
    # Look up the values for those keys
    do_something = [data_mp[x] for x in some_keys]
    return do_something
if __name__ == "__main__":
    multiprocessing.freeze_support()    # Using this script as a PyInstalled .exe later on ...
    DATA = readpickle('my_pickle.pkl')   # my_pickle.pkl is huge, ~1GB
    # DATA looks like this:
    # {1: ['some text', SOME_1D_OR_2D_LIST...[[1,2,3], [123...]]], 
    #  2: ..., 
    #  3: ..., ..., 
    #  1 million keys... }
    # Here I'm doing something with DATA in the main program...
    # Then I want to spawn N multiprocessing subprocesses, each doing some logic and then accessing a few keys of DATA to read from ...
    manager = multiprocessing.Manager()
    data_mp = manager.dict(DATA)    # Right now I'm putting DATA into the shared memory... so it effectively duplicates the required memory...
    joblist = []
    for idx in range(10000): # Generate the workers, pass the shared memory link data_mp to each worker later on ...
        joblist.append((idx, data_mp, some_more_args))
    # Start Pool of Procs... 
    p = multiprocessing.Pool()
    returnNodes = []
    for ret in p.imap_unordered(workerFunc, joblist):
        returnNodes.append(ret)
    # Do some after work with DATA and returnNodes...
    # and generate some overview xls-file out of it
Unfortunately there's no other way to save my big dictionary... I know an SQL database would be better because each worker only accesses a few keys of data_mp within its subprocess, but I don't know in advance which keys each worker will access.
So my question is: is there any other way to do this on Windows instead of using manager.dict(), which, as stated above, effectively duplicates the required memory?
Thanks!
EDIT: Unfortunately, in my corporate environment my tool can't use an SQL database server because there's no dedicated machine available. I can only work with files on network drives. I already tried SQLite, but it was seriously slow (even though I didn't understand why...). And yes, DATA is a simple key->value kind of dictionary...
And I'm using Python 2.7!
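For reference, here is a minimal sketch of what a file-based key->value store for a dictionary like DATA could look like with the standard-library sqlite3 module. It is an illustration under assumptions, not the code that was actually tried: the table name kv and the helpers build_store and fetch_values are hypothetical, and values are simply pickled into a BLOB column. It is written so that it should run on Python 2.7 as well as Python 3.

```python
import os
import pickle
import sqlite3
import tempfile


def build_store(db_path, data):
    """Write every key -> value pair of `data` into one SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k INTEGER PRIMARY KEY, v BLOB)")
    # Pickle each value (protocol 2 is the highest one Python 2.7 supports).
    rows = ((k, sqlite3.Binary(pickle.dumps(v, 2))) for k, v in data.items())
    # All rows go in under a single commit; committing per row is far slower.
    conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", rows)
    conn.commit()
    conn.close()


def fetch_values(db_path, keys):
    """Open the store inside a worker and load only the requested keys."""
    conn = sqlite3.connect(db_path)
    result = {}
    for k in keys:
        row = conn.execute("SELECT v FROM kv WHERE k = ?", (k,)).fetchone()
        if row is not None:
            # bytes() handles both the Python 2 buffer and Python 3 bytes.
            result[k] = pickle.loads(bytes(row[0]))
    conn.close()
    return result


if __name__ == "__main__":
    # Tiny stand-in for DATA; the real dictionary has about a million keys.
    sample = {1: ['some text', [[1, 2, 3], [4, 5]]], 2: ['other', [7, 8]]}
    path = os.path.join(tempfile.mkdtemp(), 'data.sqlite')
    build_store(path, sample)
    print(fetch_values(path, [2, 1]))
```

With a layout like this, each job tuple would carry only the database path instead of data_mp, and workerFunc would call fetch_values(db_path, some_keys) once the keys are known, so no subprocess holds the whole dictionary.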