I have spent some time trying to find the fastest way to get the HTTP status codes of a huge list of URLs, but I have made no progress.
Here is my code:
import multiprocessing
import time

import requests


def check(url):
    """Send a HEAD request to url and return its HTTP status code as a string."""
    try:
        response = requests.head(url)
    except requests.exceptions.RequestException:
        # Treat any request failure as a "404"
        return "404"
    return str(response.status_code)


def multiprocessing_func():
    url_list = [
        # A huge list of URLs
    ]
    pool = multiprocessing.Pool()
    start = time.time()
    pool.map(check, url_list)
    pool.close()
    pool.join()
    done = time.time()
    print("time: {}".format(done - start))


if __name__ == "__main__":
    multiprocessing_func()
My laptop is a little slow, but here are the timings I get:
when url_list has 1 URL, it takes 6 seconds,
with 8 items, it takes 10 seconds,
with 32 items, it takes 24 seconds,
with 128 items, it takes 77 seconds, and so on...
Why does the time keep growing even though I am using multiprocessing?
I expected it to take roughly 6 or 7 seconds regardless of the list size (about the same as for a single URL), since the requests should be handled in parallel.
What did I do wrong?
How can I do this as fast as possible (say, for a list of 10,000 URLs)?
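To show what I mean by "fast", something along the lines of the rough sketch below is what I have in mind, using threads instead of processes since the work is mostly waiting on the network. This is just an illustration, not something I have benchmarked: the URL list and the max_workers value are made-up placeholders, and I am not sure threads are even the right tool here.

import concurrent.futures

import requests


def check(url):
    """Send a HEAD request and return the status code (None on failure)."""
    try:
        return requests.head(url, timeout=5).status_code
    except requests.exceptions.RequestException:
        return None


# Placeholder list standing in for the real 10,000 URLs.
url_list = ["https://example.com"] * 100

# max_workers=50 is a guess, not a tuned value.
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
    status_codes = list(executor.map(check, url_list))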
Any suggestion would be appreciated.
Best regards.