I am new to Scrapy and have not found any help so far.
I want to build a small scraper that collects all the URLs on a page and then visits them one by one; if a URL returns a downloadable file of any extension, the scraper should download it and save it to a specified location. Here is the code I have written so far:

items.py
import scrapy
class ZcrawlerItem(scrapy.Item):
    file = scrapy.Field()
    file_url = scrapy.Field()
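(Side note: these file / file_url fields roughly mirror what Scrapy's built-in FilesPipeline works with, so enabling that pipeline might be an option for the "download and save" part. Below is a sketch of the settings I imagine would be needed; the store path is just a placeholder and I have not actually tried this route.)

# settings.py (sketch) -- enable the stock FilesPipeline and point it
# at the custom field names instead of the default file_urls / files
ITEM_PIPELINES = {
    'scrapy.pipelines.files.FilesPipeline': 1,
}
FILES_STORE = '/path/to/downloads'   # the "specified location"
FILES_URLS_FIELD = 'file_url'        # note: the pipeline expects this field to hold a *list* of URLs
FILES_RESULT_FIELD = 'file'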
spider.py
from scrapy import Selector
from scrapy.spiders import CrawlSpider, Rule
from scrapy.http import Request
from crawler.items import ZcrawlerItem

DOMAIN = 'example.com'
URL = 'http://%s' % DOMAIN
class MycrawlerSpider(CrawlSpider):
    name = "mycrawler"
    allowed_domains = [DOMAIN]
    start_urls = [
        URL
    ]
    def parse_dir_contents(self, response):
        print(response.headers)
        item = ZcrawlerItem()
        item['file_url'] = response.url
        return item
    def parse(self, response):
        hxs = Selector(response)
        for url in hxs.xpath('//a/@href').extract():
            if (url.startswith('http://') or url.startswith('https://')):
                yield Request(url, callback=self.parse_dir_contents)
        for url in hxs.xpath('//iframe/@src').extract():
            # iframe src values are often relative, so build an absolute URL first
            yield Request(response.urljoin(url), callback=self.parse_dir_contents)
The issue I am facing is that parse_dir_contents is not showing the headers, so it has become difficult to check whether the response data is a downloadable file or just page content.
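What I would like to end up with is roughly the following in the callback; the SAVE_DIR path and the "anything that is not text/html is a file" rule are only my assumptions, not tested:

import os
from urllib.parse import urlparse

SAVE_DIR = '/tmp/downloads'  # placeholder for the "specified location"

def parse_dir_contents(self, response):
    # Content-Type comes back as bytes under Python 3; default to b'' so the check never crashes
    content_type = response.headers.get('Content-Type', b'').decode('utf-8', 'ignore')
    if 'text/html' not in content_type:
        # treat anything that is not an HTML page as a downloadable file
        filename = os.path.basename(urlparse(response.url).path) or 'downloaded_file'
        with open(os.path.join(SAVE_DIR, filename), 'wb') as f:
            f.write(response.body)
    item = ZcrawlerItem()
    item['file_url'] = response.url
    return item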
BTW I am using Scrapy 1.1.0 and Python 3.4
Any help would be really appreciated!!