I'm halfway through writing a scraper with Scrapy and am worried that its asynchronous behaviour may result in issues.
I start on a page that has several links `a`, and from each of them I get an `x`. These `x` are saved (downloaded). Then I go to another page `b`, where I use some info I got from one of the `a` links (it is constant for all of them) to select and download `y`.
Then I "pair" `x` and `y`; how I pair them is not important, what matters is just that both `x` and `y` exist (have been downloaded).
Now I would consider my starting page (start_urls) processed, and I would get the link to 'turn' the page (as in I'm on page 1 and am now going to page 2), which I then Request to start the process from the beginning.
The code looks roughly like this:
# ..imports, class etc.
start_urls = ['bla']
start_url_buddy = 'bli'  # single URL string so it can be passed to Request directly
def parse(self, response):
    urls = response.xpath(...)
    for url in urls:
        yield scrapy.Request(url, callback=self.parse_child)
    yield scrapy.Request(self.start_url_buddy, callback=self.parse_buddy)
    # relies on self.info already having been set by parse_child
    pair_function(self.info)
    # Finished processing start page. Now turning the page.
    # could do smth like this to get next page:
    nextpage_url = response.xpath(...@href)
    yield scrapy.Request(nextpage_url)
    # or maybe something like this?
    self.start_urls.append(response.xpath(...@href))
# links `a`
def parse_child(self, response):
    # info for processing link `b`
    self.info = response.xpath(...)
    # Download link
    x = response.xpath(...)
    # urlopen etc. write x to file in central dir
# link `b`
def parse_buddy(self, response):
    # Download link
    y = response.xpath(...self.info...)
    # urlopen etc. write y to file in central dir
I haven't gotten to the page-turning part yet and am worried about whether it will work as intended (I'm still fiddling with the pairing function atm; getting the `x`s and `y` works fine for one page). I don't care in what order the `x`s and `y` are fetched, as long as it happens before `pair_function` and before 'turning the page' (which is when the parse function should be called again).
I have looked at a couple of other SO questions like this, but I haven't been able to get a clear answer from them. My basic problem is that I'm unsure how exactly the asynchronicity is implemented (it doesn't seem to be explained in the docs?).
EDIT: To be clear, what I'm scared will happen is that `yield scrapy.Request(nextpage_url)` will be executed before the previous requests have gone through. I'm now thinking I can maybe safeguard against that by just appending to `start_urls` (as I've done in the code) after everything has been done (the logic being that this should result in the parse function being called on the appended URL)?
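Something like the sketch below is what I have in mind for the appending idea (untested, and I'm not sure Scrapy even looks at start_urls again after startup; the xpaths are still just placeholders):

def parse(self, response):
    # ... yield the Requests for the `a` links and for `b`, as above ...
    pair_function(self.info)
    # only once pairing is done, queue the next page,
    # hoping parse then gets called on the appended URL
    nextpage_url = response.xpath(...@href)
    self.start_urls.append(nextpage_url)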