I'm using a virtual machine (16 vCPU, 32 GB RAM, 100 GB disk) on Compute Engine, with the specs listed below. As far as I understand it, the machine has 8 cores, each able to run 2 threads at the same time, giving 16 threads in total.
What I am doing:
- I am querying a Docker service from a Python client. The input is a PDF, the output is a parsed file. The Docker service has a concurrency of 15.
- I am running 15 threads in the ThreadPoolExecutor (I pasted the specific lines below)
The issue:
- Out of 1800 requests that I made, only 970 of them - barely more than half - actually succeeded. The rest timed out with a 408 error.
- I know other performance parameters could affect the timeouts, but the machine is fairly robust, and the same tasks run on my local machine, which is much less powerful, with fewer timeouts.
What I tried to fix it:
- I've tried lowering the number of threads, but I still get a significant number of timeouts. I thought the bottleneck might be the Docker service, but since I don't enforce any limits on the container, it should be able to use all the available resources.
Any idea what might be the root cause of this issue in my setup? How could I solve it?
Machine Specs (lscpu)
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              2
Core(s) per socket:              4
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           79
Model name:                      Intel(R) Xeon(R) CPU @ 2.20GHz
Stepping:                        0
CPU MHz:                         2200.208
BogoMIPS:                        4400.41
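The CPU(s) line in lscpu counts logical CPUs (sockets x cores per socket x threads per core). As a cross-check, the same number can be read from Python with a sketch like the one below. The helper name `logical_cpu_counts` is made up for illustration, and `os.sched_getaffinity` is only available on Linux, so the code falls back to `os.cpu_count()` elsewhere.

```python
import os


def logical_cpu_counts():
    """Return (total logical CPUs, CPUs usable by this process)."""
    total = os.cpu_count()  # may be None if it cannot be determined
    # sched_getaffinity exists on Linux only; fall back to the total elsewhere.
    if hasattr(os, "sched_getaffinity"):
        usable = len(os.sched_getaffinity(0))
    else:
        usable = total
    return total, usable


if __name__ == "__main__":
    total, usable = logical_cpu_counts()
    print(f"logical CPUs: {total}, usable by this process: {usable}")
```

Running this both on the VM and inside the container would show whether the two see the same number of CPUs.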
Thread pool (lines taken from the working script, to illustrate)
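    # Excerpt from a method; needs `import concurrent.futures` at module level.
    with concurrent.futures.ThreadPoolExecutor(max_workers=15) as executor:
        results = []
        for input_file in input_files:
            selected_process = self.process_pdf
            # Submit one parsing task per PDF, passing the file to the worker.
            r = executor.submit(selected_process, input_file)
            results.append(r)
    # Collect results in completion order, not submission order.
    for r in concurrent.futures.as_completed(results):
        input_file, status, text = r.result()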
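To see where the time goes, a self-contained variant of the snippet above could record per-request latency and tally status codes. This is only a sketch under stated assumptions: `process_pdf` here is a hypothetical stand-in that sleeps instead of calling the Docker service, and `timed` and `run_batch` are names made up for illustration, not part of my actual script.

```python
import concurrent.futures
import time
from collections import Counter


def process_pdf(input_file):
    """Hypothetical stand-in for the real request to the Docker service."""
    time.sleep(0.01)  # placeholder for the HTTP round trip
    # Pretend that files ending in "3.pdf" time out, for demonstration only.
    status = 408 if input_file.endswith("3.pdf") else 200
    return input_file, status, "parsed text"


def timed(func, input_file):
    """Run func(input_file) and return its result plus elapsed seconds."""
    start = time.perf_counter()
    result = func(input_file)
    return result, time.perf_counter() - start


def run_batch(input_files, max_workers=15):
    """Submit one task per file; tally status codes and collect latencies."""
    statuses = Counter()
    latencies = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(timed, process_pdf, f) for f in input_files]
        for future in concurrent.futures.as_completed(futures):
            (input_file, status, text), elapsed = future.result()
            statuses[status] += 1
            latencies.append(elapsed)
    return statuses, latencies


if __name__ == "__main__":
    files = [f"doc{i}.pdf" for i in range(20)]
    statuses, latencies = run_batch(files)
    print(dict(statuses), f"max latency: {max(latencies):.3f}s")
```

Comparing the latency distribution for different `max_workers` values on the VM and on my local machine might show whether requests slow down as concurrency rises or whether a fixed share of them fails regardless.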
