I'm struggling with Scrapy and I don't understand exactly how passing items between callbacks works. Maybe somebody could help me.
I'm looking into http://doc.scrapy.org/en/latest/topics/request-response.html#passing-additional-data-to-callback-functions
def parse_page1(self, response):
    item = MyItem()
    item['main_url'] = response.url
    request = scrapy.Request("http://www.example.com/some_page.html",
                             callback=self.parse_page2)
    request.meta['item'] = item
    return request

def parse_page2(self, response):
    item = response.meta['item']
    item['other_url'] = response.url
    return item
I'm trying to understand the flow of actions there, step by step:
[parse_page1]
1. item = MyItem() <- an item object is created
2. item['main_url'] = response.url <- we assign a value to main_url of the item object
3. request = scrapy.Request("http://www.example.com/some_page.html", callback=self.parse_page2) <- we request a new page and launch parse_page2 to scrape it

[parse_page2]
4. item = response.meta['item'] <- This is where I get lost. Are we creating a new item object, or is this the item object created in [parse_page1]? And what does response.meta['item'] mean? In step 3 we passed the request only a link and a callback; we didn't add any additional arguments we could refer to...
5. item['other_url'] = response.url <- we assign a value to other_url of the item object
6. return item <- we return the item object as the result of the request

[parse_page1]
7. request.meta['item'] = item <- We assign the item object to the request? But the request is finished, the callback already returned the item in step 6????
8. return request <- we get the result of the request, i.e. the item from step 6, am I right?
I went through all the documentation on Scrapy requests, responses, and meta, but I still don't understand what is happening in steps 4 and 7.
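To pin down my mental model, here is a minimal plain-Python sketch of what I *think* meta does (no Scrapy, no network; the Request/Response classes below are simplified stand-ins I made up, not the real Scrapy API). The idea seems to be that meta is just a dict attached to the Request, and the framework exposes the same dict on the Response it later builds for that request:

```python
# Simplified stand-ins for Scrapy's Request/Response (hypothetical, for
# illustration only -- the real classes have many more features).

class Request:
    def __init__(self, url, callback):
        self.url = url
        self.callback = callback
        self.meta = {}  # plain dict, starts empty

class Response:
    def __init__(self, url, request):
        self.url = url
        # the framework hands the request's meta dict to the response
        self.meta = request.meta

# -- what parse_page1 does --
item = {'main_url': 'http://www.example.com/page1'}
request = Request('http://www.example.com/some_page.html', callback=None)
request.meta['item'] = item   # step 7: attach the item BEFORE returning

# -- later, when the download finishes, the framework builds the
#    response for that request and invokes the callback with it --
response = Response('http://www.example.com/some_page.html', request)

# -- what parse_page2 does --
same_item = response.meta['item']  # step 4: the same dict comes back out
assert same_item is item           # identical object, not a copy
same_item['other_url'] = response.url
```

If this sketch is right, then response.meta['item'] in parse_page2 is the very object created in parse_page1, not a new one, but I'd like confirmation that this is how Scrapy actually behaves.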