I have a large read-only data structure (a graph loaded in networkx, though this shouldn't be important) that I use in my web service. The web service is built in Flask and served through Gunicorn. It turns out that every Gunicorn worker I spin up holds its own copy of the data structure. Thus, my ~700 MB data structure, which is perfectly manageable with one worker, turns into a pretty big memory hog when I have 8 of them running. Is there any way I can share this data structure between Gunicorn processes so I don't have to waste so much memory?
- Have you considered using something like Redis to store the data and access it from each process? Would be very similar to shared memory as far as speed goes. – nathancahill Dec 02 '14 at 01:45
- I would, but we're talking about a complex graph that there's no easy way to store in Redis (Redis has no directed edge graphs or general graph support currently AFAIK). – Eli Dec 02 '14 at 01:55
- Did the solution work for you? If yes, can you let me know in detail how you did it? – neel Mar 11 '16 at 06:28
1 Answer
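It looks like the easiest way to do this is to tell Gunicorn to preload your application using the preload_app option. With preloading, the application is imported once in the master process before the workers are forked, so the workers can share the already-loaded memory pages (copy-on-write) instead of each loading its own copy. This assumes that you can load the data structure as a module-level variable: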
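    from flask import Flask
    from your.application import CustomDataStructure

    # Loaded once at import time; with preload_app this runs in the master before forking.
    CUSTOM_DATA_STRUCTURE = CustomDataStructure('/data/lives/here')

    app = Flask(__name__)
    # @app.route handlers, etc.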
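As a rough sketch of how the option might be switched on, Gunicorn accepts a Python config file. The file name `gunicorn.conf.py`, the bind address, and the worker count below are assumptions for illustration; `bind`, `workers`, and `preload_app` are Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name; pass it to Gunicorn with -c)

# Address to listen on (assumed value for illustration).
bind = "127.0.0.1:8000"

# Number of worker processes; 8 matches the setup described in the question.
workers = 8

# Import the application in the master process before forking the workers,
# so module-level data is loaded once rather than once per worker.
preload_app = True
```

This would be started with something like `gunicorn -c gunicorn.conf.py yourmodule:app`, where `yourmodule:app` is a placeholder for wherever the Flask `app` object lives. The `--preload` command-line flag sets the same option without a config file.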
Alternatively, you could use a memory-mapped file (if you can wrap the shared memory with your custom data structure), run Gunicorn with gevent workers so that a single process handles the concurrency, or use the multiprocessing module to spin up your own data-structure server which you connect to using IPC.
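The data-structure server option could look roughly like the sketch below, built on `multiprocessing.managers.BaseManager` from the standard library. `GraphService`, `make_server`, `connect`, and the plain-dict adjacency format are hypothetical names and choices made for illustration, not part of the original setup; a networkx graph would need its own wrapper methods.

```python
from multiprocessing.managers import BaseManager


class GraphService:
    """Hypothetical read-only wrapper; only small query results cross the IPC boundary."""

    def __init__(self, adjacency):
        self._adjacency = adjacency  # e.g. {"a": {"b", "c"}}

    def neighbors(self, node):
        return sorted(self._adjacency.get(node, ()))

    def has_edge(self, u, v):
        return v in self._adjacency.get(u, ())


_service = None  # set in the server process only


def _get_service():
    return _service


class GraphManager(BaseManager):
    """Manager subclass so registrations do not leak onto BaseManager itself."""


# Registered at import time so server and clients agree on the type id.
GraphManager.register("get_graph", callable=_get_service)


def make_server(address, authkey, adjacency):
    """Build (but do not start) a server that owns the only copy of the data."""
    global _service
    _service = GraphService(adjacency)
    return GraphManager(address=address, authkey=authkey).get_server()


def connect(address, authkey):
    """Return a proxy whose method calls are forwarded to the server process."""
    manager = GraphManager(address=address, authkey=authkey)
    manager.connect()
    return manager.get_graph()
```

A dedicated process would call `make_server(("127.0.0.1", 50000), b"secret", adjacency).serve_forever()`, and each worker would call `connect(("127.0.0.1", 50000), b"secret")` once and then use `graph.neighbors("a")` (address, port, and key are placeholder values). Every method call is a pickled round trip to the server process, so this trades memory for per-call latency.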
 
    
    
        Sean Vieira
        
- preload option is not working, can you provide some example of how to use it with some dummy data structure? – neel Mar 10 '16 at 07:02
- @neel - you're probably better off asking another question with an example of your setup and what's not working. – Sean Vieira Mar 10 '16 at 15:39
- I have posted the question here: http://stackoverflow.com/questions/35914587/how-to-get-a-concurreny-of-1000-requests-with-flask-and-gunicorn It would be great if you could look at it. Thanks in advance. – neel Mar 10 '16 at 15:44
- A great read, although it didn't help me catch the parent process while using a Uvicorn worker. I managed to stumble upon a solution that I think is even cleaner than the preload method: using a Python config file for Gunicorn, `-c gconfig.py`. – aliqandil Dec 20 '20 at 06:20
