I'd like to search text on a lot of websites at once. From what I understand (though I may be misunderstanding), this code shouldn't work well:
from twisted.python.threadpool import ThreadPool
from txrequests import Session

pages = ['http://www.url1.com', 'http://www.url2.com']

with Session(pool=ThreadPool(maxthreads=10)) as sesh:
    for pagelink in pages:
        # request the page, then try to process it right away
        newresponse = sesh.get(pagelink)
        npages, text = text_from_html(newresponse.content)
My relevant functions are below (taken from this post). I don't think their exact contents (text extraction) are important, but I'll list them just in case.
from bs4 import BeautifulSoup
from bs4.element import Comment

def tag_visible(element):
    # skip text that lives inside non-rendered tags
    if element.parent.name in ['style', 'script', 'head', 'title', 'meta', '[document]']:
        return False
    # skip HTML comments
    if isinstance(element, Comment):
        return False
    return True

def text_from_html(body):
    soup = BeautifulSoup(body, 'html.parser')
    texts = soup.find_all(string=True)
    visible_texts = filter(tag_visible, texts)
    return soup.find_all('a'), u" ".join(t.strip() for t in visible_texts)
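Just to illustrate what these return, here's a toy call on a made-up inline HTML string:

links, text = text_from_html("<html><body><a href='#'>link</a><p>hello world</p></body></html>")
# links is the list of <a> tags; text should come out as "link hello world"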
If I were executing sesh.get() on its own, without the extra functions, then as I understand it: a request is sent, and it may take a while for the response to come back. While that response is on its way, other requests are sent, and some of them may even be answered before earlier requests are.
If I were only making requests within my for loop, this would happen as planned. But if I put functions within the for loop, am I stopping the requests from being asynchronous? Wouldn't the functions have to wait for each response in order to be executed? How can I smoothly execute something like this?
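For reference, this is a minimal sketch of the shape I think the asynchronous version should take, assuming sesh.get() hands back a twisted Deferred that I can yield inside an inlineCallbacks function (I'm reusing the pool= argument from my snippet above; the URLs are placeholders):

from twisted.internet import defer, reactor
from twisted.python.threadpool import ThreadPool
from txrequests import Session

pages = ['http://www.url1.com', 'http://www.url2.com']

@defer.inlineCallbacks
def fetch_all():
    with Session(pool=ThreadPool(maxthreads=10)) as sesh:
        # fire off every request first so they all run concurrently...
        deferreds = [sesh.get(pagelink) for pagelink in pages]
        # ...then handle each response as it arrives, instead of
        # processing one page before requesting the next
        for d in deferreds:
            newresponse = yield d
            npages, text = text_from_html(newresponse.content)

fetch_all().addBoth(lambda _: reactor.stop())
reactor.run()

Is something like this the right way to keep the requests overlapping while still running text_from_html on each result?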
I'd also be open to suggestions of different modules; I have no particular attachment to this one. I think it does what I need it to, but I realize it's not necessarily the only module that can.
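For example, I gather the standard library's concurrent.futures can do this kind of thing with plain requests; a rough, untested sketch of what I have in mind:

import requests
from concurrent.futures import ThreadPoolExecutor, as_completed

pages = ['http://www.url1.com', 'http://www.url2.com']

with ThreadPoolExecutor(max_workers=10) as executor:
    # submit all requests up front; each runs in its own worker thread
    future_to_url = {executor.submit(requests.get, p): p for p in pages}
    # as_completed yields futures in the order they finish,
    # not the order they were submitted
    for future in as_completed(future_to_url):
        newresponse = future.result()
        npages, text = text_from_html(newresponse.content)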
